Abstract
In this study, we propose a model averaging approach to estimating conditional quantiles based on a set of semiparametric varying coefficient models. In contrast to the existing literature on the subject, we consider a particular form for all candidates, in which each sub-model contains only one varying coefficient, and all candidates under investigation may be misspecified. We propose a weight choice criterion based on a leave-more-out cross-validation objective function. Moreover, the resulting averaging estimator is more robust against model misspecification because the weighted coefficients adjust the relative importance of the varying and constant coefficients for the same predictors. We establish statistical properties for each sub-model and the asymptotic optimality of the weight selection method. Simulation studies show that the proposed procedure has satisfactory prediction accuracy. An analysis of skin cutaneous melanoma data further supports the merits of the proposed approach.
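The weight-selection idea can be illustrated with a toy numerical sketch. This is not the paper's estimator: the semiparametric varying-coefficient sub-models are replaced by simple linear quantile fits via subgradient descent, and the data-generating process, candidate set (p = 2), and grid search are invented for illustration.

```python
import numpy as np

def pinball(u, tau):
    # check (pinball) loss rho_tau(u)
    return np.maximum(tau * u, (tau - 1.0) * u)

def fit_linear_quantile(X, y, tau, lr=0.05, iters=2000):
    # subgradient descent on the pinball loss; a crude stand-in for the
    # paper's semiparametric sub-model estimators
    beta = np.zeros(X.shape[1])
    for t in range(iters):
        r = y - X @ beta
        s = np.where(r > 0, tau, tau - 1.0)  # subgradient of rho_tau at r
        beta += lr / np.sqrt(t + 1.0) * (X * s[:, None]).mean(axis=0)
    return beta

def cv_weights(preds_val, y_val, tau, grid=101):
    # leave-n0-out CV criterion: choose simplex weights minimizing the
    # held-out pinball loss (a 1-D grid suffices for two candidates)
    ws = np.linspace(0.0, 1.0, grid)
    losses = [pinball(y_val - (w * preds_val[:, 0]
                               + (1 - w) * preds_val[:, 1]), tau).mean()
              for w in ws]
    w = ws[int(np.argmin(losses))]
    return np.array([w, 1.0 - w])

rng = np.random.default_rng(0)
n, n0, tau = 400, 100, 0.5
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
Xtr, ytr, Xva, yva = X[:-n0], y[:-n0], X[-n0:], y[-n0:]

# two candidate sub-models, each using one predictor plus an intercept
designs_tr = [np.c_[np.ones(n - n0), Xtr[:, j]] for j in range(2)]
designs_va = [np.c_[np.ones(n0), Xva[:, j]] for j in range(2)]
betas = [fit_linear_quantile(D, ytr, tau) for D in designs_tr]
preds_val = np.column_stack([D @ b for D, b in zip(designs_va, betas)])
w_hat = cv_weights(preds_val, yva, tau)
```

By construction the averaged predictor's held-out loss is no worse than either single candidate's, mirroring in miniature the asymptotic optimality the theorems establish.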
References
Angrist, J., Chernozhukov, V., Fernández-Val, I. (2006). Quantile regression under misspecification, with an application to the U.S. wage structure. Econometrica, 74, 539–563.
Cai, Z., Xiao, Z. (2012). Semiparametric quantile regression estimation in dynamic models with partially varying coefficients. Journal of Econometrics, 167, 413–425.
Cai, Z., Xu, X. (2008). Nonparametric quantile estimations for dynamic smooth coefficient models. Journal of the American Statistical Association, 103, 1595–1608.
Cai, Z., Chen, L., Fang, Y. (2018). A semiparametric quantile panel data model with an application to estimating the growth effect of FDI. Journal of Econometrics, 206, 531–553.
Chai, H., Shi, X., Zhang, Q., Zhao, Q., Huang, Y., Ma, S. (2017). Analysis of cancer gene expression data with an assisted robust marker identification approach. Genetic Epidemiology, 41, 779–789.
Fitzenberger, B., Koenker, R., Machado, J. (Eds.). (2002). Economic application of quantile regression. Heidelberg, Germany: Physica Verlag.
Hansen, B. E. (2007). Least squares model averaging. Econometrica, 75, 1175–1189.
Hjort, N. L., Claeskens, G. (2003). Frequentist model average estimators. Journal of the American Statistical Association, 98, 879–899.
Kai, B., Li, R., Zou, H. (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. The Annals of Statistics, 39, 305–332.
Knight, K. (1998). Limiting distributions for \(L_1\) regression estimators under general conditions. Annals of Statistics, 26, 755–770.
Koenker, R., Bassett, G. (1978). Regression quantiles. Econometrica, 46, 33–50.
Kuester, K., Mittnik, S., Paolella, M. (2006). Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics, 4, 53–89.
Li, D., Linton, O., Lu, Z. (2015). A flexible semiparametric forecasting model for time series. Journal of Econometrics, 187, 345–357.
Li, G., Li, Y., Tsai, C. L. (2015). Quantile correlations and quantile autoregressive modeling. Journal of the American Statistical Association, 110, 246–261.
Li, J., Xia, X., Wong, W. K., Nott, D. (2018). Varying-coefficient semiparametric model averaging prediction. Biometrics, 74, 1417–1426.
Li, X., Ma, X., Zhang, J. (2018). Conditional quantile correlation screening procedure for ultrahigh-dimensional varying coefficient models. Journal of Statistical Planning and Inference, 197, 62–92.
Li, Y., Graubard, B. I., Korn, E. L. (2010). Application of nonparametric quantile regression to body mass index percentile curves from survey data. Statistics in Medicine, 29, 558–572.
Lian, H. (2015). Quantile regression for dynamic partially linear varying coefficient time series models. Journal of Multivariate Analysis, 141, 49–66.
Lin, H., Fei, Z., Li, Y. (2016). A semiparametrically efficient estimator of the time-varying effects for survival data with time-dependent treatment. Scandinavian Journal of Statistics, 43, 649–663.
Liu, J., Huang, J., Zhang, Y., Lan, Q., Rothman, N., Zheng, T., Ma, S. (2013). Identification of gene-environment interactions in cancer studies using penalization. Genomics, 102, 189–194.
Lu, X., Su, L. (2015). Jackknife model averaging for quantile regressions. Journal of Econometrics, 188, 40–58.
Ma, S., Yang, L., Romero, R., Cui, Y. (2011). Varying coefficient model for gene-environment interaction: A non-linear look. Bioinformatics, 27, 2119–2126.
Mack, Y., Silverman, B. (1982). Weak and strong uniform consistency of kernel regression estimates. Probability Theory and Related Fields, 61, 405–415.
Nan, Y., Yang, Y. (2014). Variable selection diagnostics measures for high-dimensional regression. Journal of Computational and Graphical Statistics, 23, 636–656.
Shan, K., Yang, Y. (2009). Combining regression quantile estimators. Statistica Sinica, 19, 1171–1191.
Sharafeldin, N., Slattery, M. L., Liu, Q., Franco-Villalobos, C., Caan, B. J., Potter, J. D., Yasui, Y. (2015). A candidate-pathway approach to identify gene-environment interactions: Analyses of colon cancer risk and survival. Journal of the National Cancer Institute, 107(9), djv160.
Shen, Y., Liang, H. (2017). Quantile regression for partially linear varying-coefficient model with censoring indicators missing at random. Computational Statistics and Data Analysis, 117, 1–18.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman and Hall.
Stock, J., Watson, M. (2004). Combination forecasts of output growth in a seven-country data set. Journal of Forecasting, 23, 405–430.
Van der Vaart, A., Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. New York: Springer.
Wang, M., Zhang, X., Wan, A. T. K., You, K., Zou, G. (2021). Jackknife model averaging for high-dimensional quantile regression. Biometrics, 1–12.
Wheelock, D. C., Wilson, P. W. (2008). Non-parametric, unconditional quantile estimation for efficiency analysis with an application to Federal Reserve check processing operations. Journal of Econometrics, 145, 209–225.
Winnepenninckx, V., Lazar, V., Michiels, S., Dessen, P., Stas, M., Alonso, S. R., Avril, M., Romero, P. L., Robert, T., Balacescu, O., Eggermont, A. M., Lenoir, G., Sarasin, A., Tursz, T., Oord, J. J., Spatz, A. (2006). Gene expression profiling of primary cutaneous melanoma and clinical outcome. Journal of the National Cancer Institute, 98, 472–482.
Wu, M., Huang, J., Ma, S. (2017). Identifying gene-gene interactions using penalized tensor regression. Statistics in Medicine, 37, 598–610.
Xu, Y., Wu, M., Ma, S., Ahmed, S. (2018). Robust gene environment interaction analysis using penalized trimmed regression. Journal of Statistical Computation and Simulation, 88, 3502–3528.
Yang, Y. (2001). Adaptive regression by mixing. Journal of the American Statistical Association, 96, 574–588.
Yang, Y. (2007). Prediction/estimation with simple linear models: Is it really that simple? Econometric Theory, 23, 1–36.
Ye, C., Yang, Y., Yang, Y. (2018). Sparsity oriented importance learning for high-dimensional linear regression. Journal of the American Statistical Association, 113, 1797–1812.
Zhan, Z., Yang, Y. (2022). Profile electoral college cross-validation. Information Sciences, 586, 24–40.
Zhu, R., Wan, A. T. K., Zhang, X., Zou, G. (2019). A Mallows-type model averaging estimator for the varying coefficient partially linear model. Journal of the American Statistical Association, 114, 882–892.
Acknowledgements
Lin’s work was supported by the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China (No. 19XNB014).
Appendix
A.1. Proof of Theorem 1
We first introduce the following lemma, which is a direct result of Mack and Silverman (1982) and will be used in our proofs.
Lemma 1
Let \((\textbf{X}_{1},Y_{1}),... ,(\textbf{X}_{n},Y_{n})\) be i.i.d. random vectors, where \(Y_{1},... , Y_n\) are scalar random variables. Assume that \(E|Y|^{r}<\infty\) and \(\sup _\textbf{x}\int |y|^{r}f(\textbf{x},y)dy<\infty\), where f denotes the joint density of \((\textbf{X},Y)\). Let K be a bounded positive function with bounded support, satisfying a Lipschitz condition. Then,
provided that \(n^{2\eta -1}h \rightarrow \infty\) for some \(\eta <1-r^{-1}\).
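For completeness, the bound in Lemma 1 is usually quoted in the following standard form (a reconstruction in the notation above, not the paper's exact display; \(D\) denotes the compact set over which the supremum is taken):

```latex
\sup_{\mathbf{x}\in D}\left|\frac{1}{n}\sum_{i=1}^{n}
  \Bigl\{K_h(\mathbf{X}_i-\mathbf{x})\,Y_i
        - E\bigl[K_h(\mathbf{X}_i-\mathbf{x})\,Y_i\bigr]\Bigr\}\right|
  = O_p\!\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2}\right).
```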
Now, we prove the results of Theorem 1. First, we recall Knight's identity (Knight 1998), which will be used repeatedly in what follows: \(\rho _{\tau }(u-v)-\rho _{\tau }(u)=-v\psi _{\tau }(u)+\int _0^v\{I(u\le s)-I(u\le 0)\}\,ds\), where \(\psi _{\tau }(u)=\tau -I(u<0)\).
For given u, \(\tau\), and j, define \(\varepsilon _{\tau ,i}=Y_i-Q_{\tau }({\textbf{X}}_i,U_i)\) and \(r_{i(j)}=Q_{\tau }({\textbf {X}}_i,U_i)-{{\textbf {X}}}_{i(j)}^{\top }{{\varvec{\theta }}}^*_{(j)}\). Recall that \({{\varvec{\theta }}}_{(j)}^{*}=(a_{j}^{*},b^{*}_{j},{{\varvec{\beta }}}^{*\top }_{(j)})^{\top }\). To simplify the notation, we write \(\varepsilon _i\) and \(\varepsilon\) for \(\varepsilon _{\tau ,i}\) and \(\varepsilon _{\tau }\), respectively. For \({{\varvec{\theta }}}_{(j)}\in \mathbb {R}^{p+1}\), we define
We will show that for any \(s>0\), there is a constant \(M>0\) such that for all n sufficiently large, we have
where \({\varvec{\theta }}_{(j)}^{s}={\varvec{\theta }}_{(j)}^{*}+\delta _s\textbf{v}_{(j)}\), \(\textbf{v}_{(j)}=(v_1,... ,v_{p+1})^{\top }\), and \(\delta _s=o(1)\). By Knight's identity, we obtain
where
By equation (2), we have \(E\left[ G_{j, 1}\left( \textbf{v}_{(j)}\right) \right] =0\). By Assumptions (A6) and (A9),
where \(C_K\) is a finite positive constant. Hence, \(G_{j, 1}\left( \textbf{v}_{(j)}\right) =O_p\left( \overline{C}_{B_{(j)}}^{1/2}\delta _s\sqrt{n}\right) \Vert \textbf{v}_{(j)}\Vert\). By Taylor expansion, we have
Analogous to \(G_{j, 1}\left( \textbf{v}_{(j)}\right)\), by Assumption (A8), we obtain \(G_{j, 3}\left( \textbf{v}_{(j)}\right) =O_p(\overline{C}_{A_{(j)}}^{1/2}\delta _s\sqrt{n})\Vert \textbf{v}_{(j)}\Vert\). Thus we obtain (5), which implies that, with probability approaching 1, there exists a local minimum \(\widehat{{\varvec{\theta }}}_{(j)}\) in the ball \(\mathcal {B}_{M,\delta _s}=\left\{ {\varvec{\theta }}_{(j)}^{*}+\delta _s\textbf{v}_{(j)}:\Vert \textbf{v}_{(j)}\Vert \le M\right\}\) such that \(\Vert \widehat{{\varvec{\theta }}}_{(j)}-{{\varvec{\theta }}}_{(j)}^{*}\Vert =O_p(\delta _s)=o_p(1)\). By the convexity of \(G_j(\tau ,{\varvec{\theta }}_{(j)})\), \(\widehat{{\varvec{\theta }}}_{(j)}\) is also the global minimum. This proves Theorem 1. \(\square\)
A.2. Proof of Theorem 2
For given \(\tau\) and j, recall that \(\varepsilon _{i}=Y_i-Q_{\tau }({\textbf{X}}_i,U_i),\) \(r_{i(j)}=Q_{\tau }({\textbf {X}}_i,U_i)-{{\textbf {X}}}_{i(j)}^{\top }{{\varvec{\theta }}}^*_{(j)}\). Let \(\widehat{{\varvec{\omega }}}_{j}=\sqrt{nh_{j}}\left( \widehat{a}_{j}-a_j^{*}, \widehat{{\varvec{\beta }}}^{\top }_{(j)}-{{\varvec{\beta }}}^{*\top }_{(j)},h_{j}(\widehat{b}_{j}-b_j^{*})\right) ^{\top }\). It follows from Theorem 1 in Cai and Xu (2008) that
where
where \({{\textbf {X}}}^{\circ }_{i(j)}=\left( X_{ij},X_{i1},... \ ,X_{i(j-1)},X_{i(j+1)},... \ ,X_{ip},(U_{i}-u)X_{ij}/h_{j}\right) ^{\top }\). So we have
where \(\widetilde{\textbf{W}}_{n,j}(u)=\frac{1}{\sqrt{nh_{j}}}\sum _{i=1}^{n}K_{h_{j}} (U_{i}-u)\psi _{\tau }(\varepsilon _i+r_{i(j)})\widetilde{\textbf{X}}_{i(j)}\), and \(\widetilde{{\textbf {X}}}_{i(j)}=(X_{ij},X_{i1},... ,X_{i(j-1)},X_{i(j+1)},... ,X_{ip})^\top\). Noting that \(E\left( \widetilde{\textbf{W}}_{n,j}(u)\right) =0\) by (2), and
Then, for any \(\epsilon >0\), defining \(\eta _{i(j)}=(nh_j)^{-1/2}K_{h_j}(U_i-u)\psi _{\tau }(\varepsilon _i+r_{i(j)})\widetilde{{\textbf {X}}}_{i(j)}\), we have
Furthermore, by Assumptions (A6) and (A10),
Thus, \(\sum _{i=1}^{n}E\left\{ \Vert \eta _{i(j)}\Vert ^2I\left[ \Vert \eta _{i(j)}\Vert \ge \epsilon \right] \right\} =O\left( (nh_j^2)^{-1}\right) =o(1).\) According to the Lindeberg–Feller central limit theorem, we obtain
By Slutsky's theorem, we have
Therefore, the proof of Theorem 2 is completed. \(\square\)
A.3. Proof of Theorem 3
To show the results, it suffices to show that \(\sup _{\textbf{w}\in \mathcal {H}}\left| \frac{CV_{n_0}(\textbf{w})-QPE_{n}(\textbf{w})}{QPE_{n}(\textbf{w})}\right| =o_{p}(1).\) For notational simplicity, for a given \(\tau\), let \(\widehat{Q}_{j,n_0}(\cdot )=\widehat{Q}_{\tau ,n_0}^{(j)}(\cdot )\) and \(\widehat{Q}_{j}(\cdot )=\widehat{Q}_{\tau }^{(j)}(\cdot )\). By the definitions of \(CV_{n_0}({\textbf { w}})\) and \(QPE_{n}(\textbf{w})\), we have
Noting that \(E\left[ \left( Q_{\tau }({\textbf {X}},U)-\sum _{j=1}^pw_j\widehat{Q}_j({\textbf {X}},U)\right) \psi _\tau (\varepsilon )\bigg |\mathcal {D}_n\right] =0\) and
where \(E_{\textbf{X}_i,U_i}\) denotes the expectation with respect to \(\{{\textbf {X}}_i,U_i\}\). Together with Knight's identity, we obtain the decomposition
where
Next, we will show that
(i) \(\min _{\textbf{w}\in \mathcal {H}}QPE_{n}(\textbf{w})\ge E[\rho _{\tau }(\varepsilon )]-o_{p}(1)\);
(ii) \(\sup _{\textbf{w}\in \mathcal {H}}|CV_{1}(\textbf{w})|=o_{p}(1)\);
(iii) \(\sup _{\textbf{w}\in \mathcal {H}}|CV_{2}(\textbf{w})|=o_{p}(1)\);
(iv) \(\sup _{\textbf{w}\in \mathcal {H}}|CV_{3}(\textbf{w})|=o_{p}(1)\);
(v) \(\sup _{\textbf{w}\in \mathcal {H}}|CV_{4}(\textbf{w})|=o_{p}(1)\);
(vi) \(CV_{5}=o_{p}(1)\).
(i) Using (4) again and defining \(Q_j^*({\textbf {X}},U)=\widetilde{{\textbf {X}}}_{(j)}^{\top }{{\varvec{\theta }}^*_{(-j)}}\), we get
where \(E_{\textbf{X},U}\) denotes the expectation with respect to \(\{{\textbf {X}},U\}\). By Taylor expansion and Jensen's inequality, we have
the last inequality follows from Assumption (A11). Now it follows from Theorem 1 that,
Using the fact that \(D(t)=E[\rho _{\tau }(\varepsilon +t)-\rho _{\tau }(\varepsilon )]\) has a global minimum at \(t=0\), we have \(\min _{{\textbf { w}}\in \mathcal {H}}E\left[ \rho _\tau (\varepsilon +r({\textbf { w}}))\right] \ge E[\rho _\tau (\varepsilon )].\) Combining this with (7), we get
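The global-minimum claim for \(D(t)\) can be verified directly; a standard one-line argument, where \(F_{\varepsilon }\) and \(f_{\varepsilon }\) (symbols introduced here) denote the distribution and density functions of \(\varepsilon\):

```latex
D'(t) = E\bigl[\tau - I(\varepsilon + t < 0)\bigr] = \tau - F_{\varepsilon}(-t),
\qquad
D''(t) = f_{\varepsilon}(-t) \ge 0,
```

so \(D\) is convex with \(D'(0)=\tau -F_{\varepsilon }(0)=0\), because \(P(\varepsilon \le 0)=\tau\) by the definition of the conditional \(\tau\)-quantile; hence \(t=0\) is a global minimizer.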
(ii) Define \(Q_j^*({\textbf {X}}_i,U_i)=\widetilde{{\textbf {X}}}_{i(j)}^{\top }{{\varvec{\theta }}^*_{(-j)}}\). A direct calculation gives the decomposition
It is easy to show that \(E(CV_{11}({\textbf { w}}))=0\) and \(\textrm{Var}(CV_{11}({\textbf { w}}))=O(1/(n-n_0)),\) which implies that \(CV_{11}({\textbf { w}})=o_p(1)\) for each \({\textbf { w}}\in \mathcal {H}\). To show the uniform convergence, we consider the function class \(\mathcal {F}=\{g(\varepsilon _i,{\textbf {X}}_i,U_i;{\textbf { w}}): {\textbf { w}}\in \mathcal {H}\},\) where \(g(\varepsilon _i,{\textbf {X}}_i,U_i;{\textbf { w}})=\left[ Q_{\tau }({\textbf {X}}_i,U_i)-\sum _{j=1}^{p}w_{j}Q_j^*({\textbf {X}}_i,U_i) \right] \psi _{\tau }(\varepsilon _i)\). On \(\mathcal {H}\), we define the metric \(|\cdot |_1\) as \(|{\textbf { w}}-\tilde{{\textbf { w}}}|_1=\sum _{j=1}^p|w_j-\tilde{w}_j|\), for any \({\textbf { w}}=(w_1,... ,w_p)\in \mathcal {H}\) and \(\tilde{{\textbf { w}}}=(\tilde{w}_1,... ,\tilde{w}_p)\in \mathcal {H}\). Then, the \(\epsilon\)-covering number of \(\mathcal {H}\) with respect to \(|\cdot |_1\) is \(\mathcal {N}(\epsilon ,\mathcal {H},|\cdot |_1)=O(1/\epsilon ^{p-1})\). Further,
where \(C_\theta =p\max _{1\le j\le p}\Vert {{\varvec{\theta }}}_{(-j)}^*\Vert =O(p^{3/2})\) and \(E\max _{1\le j\le p}\Vert \widetilde{{\textbf {X}}}_{i(j)}\Vert <\infty\) by Assumption (A3). For a fixed p, this yields that the \(\epsilon\)-bracketing number of \(\mathcal {F}\) with respect to the \(L_1\)-norm is \(\mathcal {N}_{[]}(\epsilon ,\mathcal {F},L_1(P))\le C/\epsilon ^{p-1}\) for some constant C. By Theorem 2.4.1 of Van der Vaart and Wellner (1996), we conclude that \(\mathcal {F}\) is a Glivenko–Cantelli class, and hence \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{11}({\textbf { w}})|=o_p(1)\). By the Cauchy–Schwarz inequality,
where \(\widehat{{\varvec{\theta }}}_{(-j),n_0}=(\widehat{a}_{j,n_0},\widehat{{\varvec{\beta }}}_{(j),n_0}^{\top })^{\top }\), then by Theorem 1 and Assumption (A3), we get \(\mathop {\sup }_{\textbf{w}\in \mathcal {H}}|CV_{12}(\textbf{w})|=o_p(1).\)
(iii) To prove (iii), we rewrite \(CV_2({\textbf { w}})=CV_{21}({\textbf { w}})+CV_{22}({\textbf { w}})\), where
Note that \(E[CV_{21}({\textbf { w}})]=0\) and \(\textrm{Var}(CV_{21}({\textbf { w}}))=O(1/(n-n_0))\). Analogous to the proof for \(CV_{11}({\textbf { w}})\), we can show that \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{21}({\textbf { w}})|=o_p(1)\). On the other hand,
similarly to \(CV_{12}({\textbf { w}})\), we have \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{22}({\textbf { w}})|=o_p(1).\)
(iv) We also decompose \(CV_3({\textbf { w}})=CV_{31}({\textbf { w}})+CV_{32}({\textbf { w}})\) with
Similar to the proof of \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{11}({\textbf { w}})|=o_p(1),\) we can show that \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{31}({\textbf { w}})|=o_p(1)\); the details are omitted here.
Noting that
We can prove that \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{321}({\textbf { w}})|=o_p(1)\) by the same argument as for \(CV_{22}({\textbf { w}})\). Furthermore, by the Cauchy–Schwarz inequality, we have
By Assumption (A8) and Theorem 1, we have \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{322}({\textbf { w}})|=o_p(1).\)
(v) To prove (v), we note that
following the proof of \(\sup _{{\textbf { w}}\in \mathcal {H}}|CV_{322}({\textbf { w}})|=o_p(1)\), we obtain (v).
(vi) \(CV_5=o_p(1)\) follows from the weak law of large numbers.
This completes the proof of Theorem 3. \(\square\)
A.4. Proof of Theorem 4
According to the proof of Theorem 1, we can further obtain that \(\Vert \widehat{{\varvec{\theta }}}_{(j)}-{{\varvec{\theta }}}^{*}_{(j)}\Vert =o_p(1)\) and \(\Vert \widehat{{\varvec{\theta }}}_{(j),n_0}-{{\varvec{\theta }}}^{*}_{(j)}\Vert =o_p(1)\) uniformly for all \(\tau \in \mathcal {T}\). That is to say, we have \(\sup _{\tau \in \mathcal {T}}\Vert \widehat{{\varvec{\theta }}}_{(j)}-{{\varvec{\theta }}}^{*}_{(j)}\Vert =o_p(1)\) and \(\sup _{\tau \in \mathcal {T}}\Vert \widehat{{\varvec{\theta }}}_{(j),n_0}-{{\varvec{\theta }}}^{*}_{(j)}\Vert =o_p(1)\).
In the following, we prove that \(\widehat{\textbf{w}}\) is asymptotically optimal uniformly for \(\tau \in \mathcal {T}\). The proof is analogous to that of Theorem 3, but is more challenging because the asymptotic optimality of \(\widehat{\textbf{w}}\) must hold uniformly over the set of quantile indices. Specifically, we need to prove that (i)–(vi) of Theorem 3 hold with \(\textbf{w}\in \mathcal {H}\) replaced by \((\textbf{w},\tau )\in \mathcal {H}\times {\mathcal {T}}\).
(a) According to the proof of (i) in Theorem 3, it is easy to obtain that
that is, \(\mathop {\inf }\limits _{(\textbf{w},\tau )\in \mathcal {H}\times {\mathcal {T}}}QPE_{\tau ,n}(\textbf{w})\ge E[\rho _{\tau }(\varepsilon )]-o_{p}(1)\).
(b) We have \(CV_1(\textbf{w})=CV_{11}(\textbf{w})-CV_{12}(\textbf{w})\). To prove (b), it suffices to show that \(\sup _{(\textbf{w},\tau )\in \mathcal {H}\times {\mathcal {T}}}|CV_{11}(\textbf{w})|=o_p(1)\) and \(\sup _{(\textbf{w},\tau )\in \mathcal {H}\times {\mathcal {T}}}|CV_{12}(\textbf{w})|=o_p(1)\). Similar to the proof of (ii) in Theorem 3, to show the uniform convergence, we consider the class of functions \(\mathcal {G}=\{g(\varepsilon _i,\textbf{X}_i,U_i;\textbf{w},\tau ):(\textbf{w},\tau )\in \mathcal {H}\times \mathcal {T}\},\) where \(g(\varepsilon _i,{\textbf {X}}_i,U_i;{\textbf { w}},\tau )=\left[ Q_{\tau }({\textbf {X}}_i,U_i) -\sum _{j=1}^{p}w_{j}Q_j^*({\textbf {X}}_i,U_i)\right] \psi _{\tau }(\varepsilon _i)\). On \(\mathcal {H}\times \mathcal {T}\), we define the metric \(|\cdot |_1^t\) as \(|({\textbf { w}},\tau )-(\tilde{{\textbf { w}}},\tilde{\tau })|_1^{t}=\sum _{j=1}^p|w_j-\tilde{w}_j|+|\tau -\tilde{\tau }|\). Then, the \(\epsilon\)-covering number is \(\mathcal {N}(\epsilon ,\mathcal {H}\times \mathcal {T},|\cdot |_1^{t})=O(1/\epsilon ^{p})\). Further, the \(\epsilon\)-bracketing number satisfies \(\mathcal {N}_{[]}(\epsilon ,\mathcal {G},L_1(P))\le C/\epsilon ^{p}\), and it follows from the Glivenko–Cantelli theorem that \(\sup _{(\textbf{w},\tau )\in \mathcal {H}\times {\mathcal {T}}}|CV_{11}(\textbf{w})|=o_p(1)\).
We also have
Hence
Similarly, (iii), (iv), (v), and (vi) follow from the corresponding proofs in Theorem 3, together with the facts that \(\sup _{\tau \in \mathcal {T}}\Vert \widehat{{\varvec{\theta }}}_{(j)}-{{\varvec{\theta }}}^{*}_{(j)}\Vert =o_p(1)\) and \(\sup _{\tau \in \mathcal {T}}\Vert \widehat{{\varvec{\theta }}}_{(j),n_0}-{{\varvec{\theta }}}^{*}_{(j)}\Vert =o_p(1)\). This completes the proof of Theorem 4. \(\square\)
Cite this article
Zhan, Z., Li, Y., Yang, Y. et al. Model averaging for semiparametric varying coefficient quantile regression models. Ann Inst Stat Math 75, 649–681 (2023). https://doi.org/10.1007/s10463-022-00857-z