
Quantile regression and variable selection of single-index coefficient model


Abstract

In this paper, a minimizing average check loss estimation (MACLE) procedure is proposed for the single-index coefficient model (SICM) in the framework of quantile regression (QR). The resulting estimators are asymptotically normal and achieve the optimal convergence rate. Furthermore, a variable selection method is investigated for the QRSICM by combining the MACLE method with the adaptive LASSO penalty, and we establish the oracle property of the proposed variable selection method. Extensive simulations are conducted to assess the finite-sample performance of the proposed estimation and variable selection procedures under various error settings. Finally, we present a real-data application of the proposed approach.
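
For readers who want to experiment with the procedure, the basic building blocks of the MACLE objective are the quantile check loss and its associated score. The following minimal sketch is our own illustration (the function names are not the paper's notation):

```python
import numpy as np

# Check loss rho_tau(u) = u * (tau - I(u < 0)) and the score
# psi_tau(u) = tau - I(u < 0) used throughout the paper's appendix.
# Illustrative sketch only, not code from the paper.
def rho(u, tau):
    return u * (tau - (u < 0))

def psi(u, tau):
    return tau - (u < 0) * 1.0

u = np.linspace(-2.0, 2.0, 5)
print(rho(u, 0.75))   # asymmetric absolute loss, kinked at zero
print(psi(u, 0.75))   # tau - I(u < 0)
```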


References

  • Cai, Z., Xu, X. (2008). Nonparametric quantile estimations for dynamic smooth coefficient models. Journal of the American Statistical Association, 103, 1595–1608.

  • Fan, J., Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360.

  • Fan, J., Yao, Q., Cai, Z. (2003). Adaptive varying-coefficient linear models. Journal of the Royal Statistical Society Series B (Statistical Methodology), 65, 57–80.

  • Feng, S., Xue, L. (2013). Variable selection for single-index varying-coefficient model. Frontiers of Mathematics in China, 8, 541–565.

  • Geyer, C. J. (1994). On the asymptotics of constrained M-estimation. The Annals of Statistics, 22, 1993–2010.

  • Härdle, W., Hall, P., Ichimura, H. (1993). Optimal smoothing in single-index models. The Annals of Statistics, 21, 157–178.

  • Hjort, N., Pollard, D. (1993). Asymptotics for minimizers of convex processes (preprint).

  • Honda, T. (2004). Quantile regression in varying coefficient models. Journal of Statistical Planning and Inference, 121, 113–125.

  • Huang, Z., Zhang, R. (2013). Profile empirical-likelihood inferences for the single-index-coefficient regression model. Statistics and Computing, 23, 455–465.

  • Jiang, R., Zhou, Z., Qian, W., Shao, W. (2012). Single-index composite quantile regression. Journal of the Korean Statistical Society, 41, 323–332.

  • Kai, B., Li, R., Zou, H. (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. The Annals of Statistics, 39, 305–332.

  • Kim, M.-O. (2007). Quantile regression with varying coefficients. The Annals of Statistics, 35, 92–108.

  • Knight, K. (1998). Limiting distributions for \(l_1\) regression estimators under general conditions. The Annals of Statistics, 26, 755–770.

  • Koenker, R., Bassett, G. (1978). Regression quantiles. Econometrica, 46, 33–50.

  • Lu, Z., Tjøstheim, D., Yao, Q. (2007). Adaptive varying-coefficient linear models for stochastic processes: Asymptotic theory. Statistica Sinica, 17, 177–197.

  • Mack, Y. P., Silverman, B. W. (1982). Weak and strong uniform consistency of kernel regression estimates. Probability Theory and Related Fields, 61, 405–415.

  • Shapiro, S. S., Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52, 591–611.

  • Wang, H., Leng, C. (2007). Unified lasso estimation via least squares approximation. Journal of the American Statistical Association, 102, 1039–1048.

  • Wu, T., Yu, K., Yu, Y. (2010). Single-index quantile regression. Journal of Multivariate Analysis, 101, 1607–1621.

  • Xia, Y., Tong, H., Li, W. K. (1999). On extended partially linear single-index models. Biometrika, 86, 831–842.

  • Xue, L., Pang, Z. (2013). Statistical inference for a single-index varying-coefficient model. Statistics and Computing, 23, 589–599.

  • Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38, 894–942.

  • Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101, 1418–1429.

Author information

Correspondence to Riquan Zhang.

Additional information

The research was supported in part by the National Natural Science Foundation of China (11501372, 11571112), the National Social Science Fund of China (15BTJ027), the Doctoral Fund of the Ministry of Education of China (20130076110004), the Program of Shanghai Subject Chief Scientist (14XD1401600) and the 111 Project of China (B14019).

Appendix

To establish the asymptotic properties and the oracle property of the proposed methods, we need the following regularity conditions:

  1. A.1

    The kernel function \(K(\cdot )\) is a symmetric, Lipschitz continuous density function with compact support, and it satisfies \(\int _{-\infty }^{\infty }z^2K(z)dz<\infty \) and \(\int _{-\infty }^{\infty }z^jK^2(z)dz<\infty , ~j=0,1,2\);

  2. A.2

    Let \(\varvec{\Theta }\) be a local neighborhood of \(\varvec{\theta }\) and \(\Xi \) the compact support of the covariate \(\mathbf {X}\). Let \(\mathcal {U}=\left\{ u=\mathbf {x}^T\varvec{\theta };\mathbf {x} \in \Xi , \varvec{\theta }\in \varvec{\Theta } \right\} \) be the compact support of \(\mathbf {X}^T\varvec{\theta }\) with marginal density \(f_{\mathcal {U}}(u)\). Furthermore, \(f_{\mathcal {U}}(u)\) is first-order Lipschitz continuous and bounded away from zero;

  3. A.3

    Write \(u_{\varvec{\theta }}=\mathbf {x}^T\varvec{\theta }\). The index function \(\mathbf {\alpha }(u_{\varvec{\theta }})\) is twice differentiable with respect to \(u_{\varvec{\theta }}\) and Lipschitz continuous with respect to \(\varvec{\theta }\);

  4. A.4

    Given \(\mathbf {X}^T\varvec{\theta }=u\), the conditional density \(f(y|u)\) is Lipschitz continuous with respect to y and u;

  5. A.5

    The matrix functions \(\mathrm {E}(\mathbf {X}|\mathbf {X}^T\varvec{\theta }=u)\), \(\mathrm {E}(\mathbf {Z}|\mathbf {X}^T\varvec{\theta }=u)\), \(\mathrm {E}(\mathbf {X}^{\otimes 2}|\mathbf {X}^T\varvec{\theta }=u)\), \(\mathrm {E}(\mathbf {Z}^{\otimes 2}|\mathbf {X}^T\varvec{\theta }=u)\) and  \(\mathrm {E}(\mathbf {X}\mathbf {Z}^T|\mathbf {X}^T\varvec{\theta }=u)\) are Lipschitz continuous uniformly with respect to \(u\in \mathcal {U}\) and \(\varvec{\theta }\in \Theta \), where \(A^{\otimes 2}=AA^T\) for any matrix or vector A;

  6. A.6

    The bandwidth h satisfies \(h\sim n^{-\delta }\), where \(1/6<\delta <1/4\);

  7. A.7

    \(\forall u\in \mathcal {U}\) and \(\varvec{\theta }\in \Theta \), the matrix \(\mathrm {E}(\mathbf {Z}^{\otimes 2}|\mathbf {X}^T\varvec{\theta }=u)\) is invertible;

  8. A.8

    \(\forall \varvec{\theta }\in \Theta \), the matrix \(\mathcal {G}\) defined in Theorem 1 is positive definite.

Remark 5

The above conditions are commonly used in the semi-parametric literature and are easily satisfied in many applications. Condition A.1 simply requires that the kernel function be a proper density with a finite second moment, which is needed to derive the asymptotic variance of the estimators. Condition A.2 guarantees the existence of ratio terms in which the density appears in the denominator. Conditions A.3 and A.4 are commonly used in the single-index model and quantile regression literature; see Wu et al. (2010), Kai et al. (2011) and Xue and Pang (2013). Condition A.5 lists some common assumptions for semi-parametric models; see, for example, Huang and Zhang (2013), Kai et al. (2011) and Xue and Pang (2013). Condition A.6 admits the optimal bandwidth in nonparametric estimation. Condition A.7 comes from Lu et al. (2007) and Kai et al. (2011). Condition A.8 is used to derive the consistency of the variable selection method.
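
As a concrete instance of Condition A.1, the Epanechnikov kernel is symmetric, Lipschitz continuous and compactly supported; a quick numerical check of its moment conditions is sketched below (the kernel choice is ours for illustration; the paper does not prescribe one):

```python
import numpy as np

# Numerical check that the Epanechnikov kernel satisfies A.1:
# a symmetric density with compact support and finite moments.
z, dz = np.linspace(-1, 1, 200001, retstep=True)
K = 0.75 * (1 - z**2)                      # Epanechnikov, support [-1, 1]

print(np.sum(K) * dz)                      # ~1: K integrates to one
print(np.sum(z**2 * K) * dz)               # mu_2 = 1/5, finite second moment
for j in range(3):                         # int z^j K^2(z) dz for j = 0, 1, 2
    print(j, np.sum(z**j * K**2) * dz)
```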

The following two lemmas will be frequently used in our proof.

Lemma 1

Suppose \(A_n(s)\) is convex and can be represented as \(\frac{1}{2}s^TVs+U_n^Ts+C_n+r_n(s)\), where V is symmetric and positive definite, \(U_n\) is stochastically bounded, \(C_n\) is arbitrary, and \(r_n(s)\) goes to zero in probability for each s. Then the argmin of \(A_n\) is only \(o_p(1)\) away from \(\beta _n=-V^{-1}U_n\), the argmin of \(\frac{1}{2}s^TVs+U_n^Ts+C_n\).

Proof

This lemma comes from the Basic proposition in Hjort and Pollard (1993). \(\square \)
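
A one-dimensional numerical illustration of Lemma 1, with made-up values of V and \(U_n\) and a vanishing remainder (purely a sketch):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Convex A_n(s) = (1/2) V s^2 + U s + r_n(s) with a remainder that
# vanishes as n grows; its argmin should approach -U/V (Lemma 1).
V, U, n = 2.0, 0.7, 10_000
A = lambda s: 0.5 * V * s**2 + U * s + np.sin(s) / np.sqrt(n)
print(minimize_scalar(A).x, -U / V)   # nearly identical values
```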

Lemma 2

Let \((U_1,Y_1),\ldots ,(U_n,Y_n)\) be independent and identically distributed random vectors, where \(Y_i\) and \(U_i\) are scalar random variables. Assume further that \(\mathrm {E}|Y|^s<\infty \) and \(\sup \limits _u \int |y|^sf(u,y)\mathrm {d}y<\infty \), where \(f(\cdot ,\cdot )\) denotes the joint density of \((U,Y)\). Let \(K(\cdot )\) be a bounded positive function with a bounded support, satisfying a Lipschitz condition. Then

$$\begin{aligned} \sup \limits _{u\in \mathscr {U}}\left| \frac{1}{n}\mathop \sum \limits _{i=1}^n[K_h(U_i-u)Y_i -\mathrm {E}(K_h(U_i-u)Y_i)]\right| =O_p\left[ \left( \frac{\ln (1/h)}{nh} \right) ^{1/2}\right] , \end{aligned}$$

provided that \(n^{2\varepsilon -1}h\rightarrow \infty \) for some \(\varepsilon <1-s^{-1}\), where \(\mathscr {U}\) is the compact support of U.

Proof

This follows from the result by Mack and Silverman (1982). \(\square \)
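
A small Monte Carlo sketch of the uniform rate in Lemma 2, taking \(Y_i\equiv 1\) and \(U_i\sim \mathrm {Uniform}(0,1)\) so that \(\mathrm {E}K_h(U_i-u)=1\) at interior points (our own illustration; the Epanechnikov kernel and \(h\sim n^{-1/5}\) are arbitrary choices):

```python
import numpy as np

# With Y_i = 1 and U_i ~ Uniform(0,1), E K_h(U_i - u) = 1 for interior u,
# so the sup should scale like sqrt(log(1/h) / (n h)).  Sketch, not a proof.
rng = np.random.default_rng(1)
Kh = lambda t, h: np.where(np.abs(t / h) <= 1, 0.75 * (1 - (t / h) ** 2), 0) / h

for n in (1_000, 10_000, 100_000):
    h = n ** (-1 / 5)
    U = rng.uniform(0, 1, n)
    grid = np.linspace(h, 1 - h, 200)                 # interior points only
    sup = max(abs(Kh(U - u, h).mean() - 1.0) for u in grid)
    rate = np.sqrt(np.log(1 / h) / (n * h))
    print(n, sup / rate)                              # roughly stable ratio
```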

Let \(\tilde{\varvec{\theta }}\) be an initial consistent estimate of the parameter \(\varvec{\theta }\), which can be obtained using existing methods; see Remark 1. In the following, we assume \(\tilde{\varvec{\theta }}-\varvec{\theta }=o_p(1)\). Denote \(\delta _n=\left[ {\ln (1/h) }/{nh} \right] ^{1/2}\), \(\tau _n=h^2+\delta _n\), \(\delta _{\varvec{\theta }}=\Vert \tilde{\varvec{\theta }}-\varvec{\theta }\Vert \) and \(K_{ih}^{\varvec{\theta }}=K_{i,h}^{\varvec{\theta }}(\mathbf {x})=K_h(\mathbf {X}_{i0}^T\varvec{\theta })\), where \(\mathbf {X}_{i0}=\mathbf {X}_i-\mathbf {x}\). Then we have the following Lemma 3.

Lemma 3

Let \(\mathbf {x}\) be an interior point of \(\Xi \) and denote

$$\begin{aligned}&S_l(\mathbf {x})=\frac{1}{n}\mathop \sum \limits _{i=1}^nK_{ih}^{\tilde{\varvec{\theta }}} \mathbf {Z}_i\mathbf {Z}_i^T \left( \frac{\mathbf {X}_{i0}^T\tilde{\varvec{\theta }}}{h} \right) ^l,\quad l=0,1,2,\\&E_l(\mathbf {x})=\frac{1}{n}\mathop \sum \limits _{i=1}^n K_{ih}^{\tilde{\varvec{\theta }}} \mathbf {Z}_i\mathbf {Z}_i^T \left( \frac{ \mathbf {X}_i-\mathbf {x}}{h} \right) ^{\otimes l},\quad l=1,2, \end{aligned}$$

then we have

$$\begin{aligned} S_0(\mathbf {x})&=\pi _{\tilde{\varvec{\theta }}}(\mathbf {x})f_{\mathcal {U}}(\mathbf {x}^T\tilde{\varvec{\theta }})+O(h^2+\delta _n),\nonumber \\&=\pi _{\varvec{\theta }}(\mathbf {x})f_{\mathcal {U}}(\mathbf {x}^T\tilde{\varvec{\theta }})+O(h^2+\delta _{\varvec{\theta }}+\delta _n),\nonumber \\ S_1(\mathbf {x})&=O(h+h\delta _{\varvec{\theta }}+\delta _n),\nonumber \\ S_2(\mathbf {x})&=\mu _2\pi _{\varvec{\theta }}(\mathbf {x})f_{\mathcal {U}} (\mathbf {x}^T\tilde{\varvec{\theta }})+O(h^2+\delta _{\varvec{\theta }}+\delta _n),\nonumber \\ E_1(\mathbf {x})&=f_{\mathcal {U}}(\mathbf {x}^T\tilde{\varvec{\theta }}) \pi _{\varvec{\theta }}(\mathbf {x})(\mu _{\varvec{\theta }}(\mathbf {x})-\mathbf {x})+O(h^2+\delta _{\varvec{\theta }}+\delta _n),\nonumber \\ E_2(\mathbf {x})&=2f_{\mathcal {U}}(\mathbf {x}^T\tilde{\varvec{\theta }}) \pi _{\varvec{\theta }}(\mathbf {x}) \Sigma _{\varvec{\theta }}(\mathbf {x})+O(h^2+\delta _{\varvec{\theta }}+\delta _n), \end{aligned}$$

where \(\mu _{\varvec{\theta }}(\mathbf {x})=\mathrm {E}(\mathbf {X}|\mathbf {X}^T\varvec{\theta }=\mathbf {x}^T\varvec{\theta })\),  \(\nu _{\varvec{\theta }}(\mathbf {x})=\mathrm {E}(\mathbf {Z}|\mathbf {X}^T\varvec{\theta }=\mathbf {x}^T\varvec{\theta })\),  \(\pi _{\varvec{\theta }}(\mathbf {x})=\mathrm {E}(\mathbf {Z}\mathbf {Z}^T|\mathbf {X}^T\varvec{\theta }=\mathbf {x}^T\varvec{\theta })\),  \(\Sigma _{\varvec{\theta }}(\mathbf {x})=\mathrm {E}\left( (\mathbf {X}-\mu _{\varvec{\theta }}(\mathbf {x}))(\mathbf {X}-\mu _{\varvec{\theta }}(\mathbf {x}))^T|\mathbf {X}^T\varvec{\theta }=\mathbf {x}^T\varvec{\theta } \right) \).

Proof

By Condition A.2, after some direct calculations, we obtain the above conclusions. \(\square \)

Lemma 4

For a given interior point \(\mathbf {x}\) of \(\Xi \), the estimates of \(\mathbf {g}(\mathbf {x}^T\tilde{\varvec{\theta }})\) and \(\mathbf {g}'(\mathbf {x}^T\tilde{\varvec{\theta }})\) are

$$\begin{aligned} (\hat{\mathbf {g}}(\mathbf {x}^T\tilde{\varvec{\theta }}),\hat{\mathbf {g}}'(\mathbf {x}^T \tilde{\varvec{\theta }}))=\mathop {\mathrm{argmin}}\limits _{\mathbf {a},\mathbf {b}} \sum \limits _{i=1}^n\rho _{\tau }\left( Y_i-\left( \mathbf {a}+\mathbf {b}\mathbf {X}_{i0}^T \tilde{\varvec{\theta }}\right) ^T\mathbf {Z}_i\right) K\left( \mathbf {X}_{i0}^T\tilde{\varvec{\theta }}/h\right) . \end{aligned}$$

Under the conditions A.1–A.7, we have

$$\begin{aligned} \hat{\mathbf {g}}(\mathbf {x}^T\tilde{\varvec{\theta }})= & {} \mathbf {g}(\mathbf {x}^T\tilde{\varvec{\theta }}) +\frac{1}{2}\mathbf {g}''(\mathbf {x}^T\tilde{\varvec{\theta }})\mu _2h^2- \mathbf {g}'(\mathbf {x}^T\tilde{\varvec{\theta }})\mu _{\varvec{\theta }} (\mathbf {x})^T\varvec{\theta }_d\nonumber \\&+\,R_{n1}^{\tilde{\varvec{\theta }}}(\mathbf {x})+O\left( h^2(h^2+\delta _{\varvec{\theta }}+\delta _n)+\delta _{\varvec{\theta }}^2\right) , \nonumber \\ \hat{\mathbf {g}}'(\mathbf {x}^T\tilde{\varvec{\theta }})= & {} \mathbf {g} '(\mathbf {x}^T\tilde{\varvec{\theta }})+\frac{1}{h}R_{n2}^{\tilde{\varvec{\theta }}}(\mathbf {x})+O(h^2+\delta _n+\delta _{\varvec{\theta }}), \end{aligned}$$

where \(\varvec{\theta }_d=\tilde{\varvec{\theta }}-\varvec{\theta }\), \(\mathbf {X}_{i0}=\mathbf {X}_i-\mathbf {x}\),  \(\psi _{\tau }(u)=\tau -I(u<0)\),

$$\begin{aligned} R_{n1}^{\varvec{\theta }}(\mathbf {x})&=[nf_Y(q_{\tau }(\mathbf {x},\mathbf {z})| \mathbf {x}^T\varvec{\theta })f_{\mathcal {U}}(\mathbf {x}^T\varvec{\theta })]^{-1}\pi _{\varvec{\theta }}(\mathbf {x})^{-1} \sum \limits _{i=1}^nK_{i,h}^{\varvec{\theta }}\psi _{\tau }(\varepsilon _i),\\ R_{n2}^{\varvec{\theta }}(\mathbf {x})&=[nh\mu _2f_Y(q_{\tau }(\mathbf {x},\mathbf {z}) |\mathbf {x}^T\varvec{\theta })f_{\mathcal {U}}(\mathbf {x}^T\varvec{\theta })]^{-1} \pi _{\varvec{\theta }}(\mathbf {x})^{-1} \sum \limits _{i=1}^n K_{i,h}^{\varvec{\theta }}\psi _{\tau }(\varepsilon _i) {\mathbf {X}_{i0}^T\varvec{\theta }}. \end{aligned}$$

In particular, \(\sup \limits _{\mathbf {x}\in \Xi } \Vert \hat{\mathbf {g}}'(\mathbf {x}^T\tilde{\varvec{\theta }})-\mathbf {g}'(\mathbf {x}^T\tilde{\varvec{\theta }} )\Vert =O(h^2+h^{-1}\delta _n+\delta _{\varvec{\theta }})\) holds.

Proof

For notational simplicity, let \(\mathbf {x}^T\tilde{\varvec{\theta }}=u\) and denote

$$\begin{aligned} \eta =\sqrt{nh} \left( {\begin{array}{c} \mathbf {a}-\mathbf {g}(u) \\ h(\mathbf {b}-\mathbf {g}'(u)) \end{array}}\right) , \hat{\eta }_n=\sqrt{nh}\left( {\begin{array}{c} \hat{\mathbf {g}}(u)-\mathbf {g}(u)\\ h(\hat{\mathbf {g}}'(u)-\mathbf {g}'(u)) \end{array}} \right) ,~ M_i=\left( {\begin{array}{c} \mathbf {Z}_i \\ \mathbf {Z}_i \mathbf {X}_{i0}^T\tilde{\varvec{\theta }}/h \end{array}}\right) \end{aligned}$$

and

$$\begin{aligned} r_i(u)=\left[ -\mathbf {g}(\mathbf {X}_i^T\varvec{\theta })+\mathbf {g}(u)+ \mathbf {g}'(u)\mathbf {X}_{i0}^T\tilde{\varvec{\theta }}\right] ^T\mathbf {Z}_i,~K_i=K\left( \mathbf {X}_{i0}^T\tilde{\varvec{\theta }}/h\right) . \end{aligned}$$

Then \(\hat{\eta }_n\) is the minimizer of the following objective function

$$\begin{aligned} Q_n(\eta )=\sum \limits _{i=1}^n \left[ \rho _{\tau }\left( \varepsilon _i-r_i(u) -\eta ^TM_i/\sqrt{nh}\right) -\rho _{\tau }(\varepsilon _i-r_i(u)) \right] K_i. \end{aligned}$$

By the identity in Knight (1998),

$$\begin{aligned} \rho _{\tau }(u-v)-\rho _{\tau }(u)=-v\psi _{\tau }(u)+\int _0^v(I(u\le s)-I(u\le 0))\mathrm{d}s, \end{aligned}$$
(16)

it follows that \(Q_n(\eta )\) can be restated as

$$\begin{aligned} Q_n(\eta )&= -\frac{1}{\sqrt{nh}}\sum \limits _{i=1}^nK_i \eta ^TM_i\psi _{\tau }(\varepsilon _i)+\sum \limits _{i=1}^nK_i \int _{r_i(u)}^{r_i(u)+M_i^T\eta /\sqrt{nh}}(I(\varepsilon _i\le s)-I(\varepsilon _i\le 0))\mathrm{d}s,\nonumber \\&\equiv -\eta ^T W_n +B_n(\eta ), \end{aligned}$$
(17)

where \(W_n=\frac{1}{\sqrt{nh}}\sum \limits _{i=1}^nK_iM_i\psi _{\tau }(\varepsilon _i)\),

$$\begin{aligned} B_n(\eta )=\sum \limits _{i=1}^nK_i\int _{r_i(u)}^{r_i(u)+M_i^T\eta /\sqrt{nh}}\left[ I(\varepsilon _i\le s)-I(\varepsilon _i \le 0 )\right] \mathrm{d}s. \end{aligned}$$

We next consider \(B_n(\eta )\). Denote by \(\tilde{\mathcal {X}}\) the \(\sigma \)-field generated by \(\{\mathbf {X}_1^T\tilde{\varvec{\theta }},\mathbf {X}_2^T\tilde{\varvec{\theta }},\ldots ,\mathbf {X}_n^T\tilde{\varvec{\theta }} \}\). Taking the conditional expectation of \(B_n(\eta )\), we have

$$\begin{aligned} \mathrm {E} \left( B_n(\eta )\bigl | \tilde{\mathcal {X}} \right)&= \sum \limits _{i=1}^{n} K_i\int _{r_i(u)}^{r_i(u)+M_i^T\eta /\sqrt{nh}} \mathrm {E} \left( I(\varepsilon _i\le s)-I(\varepsilon _i \le 0)|\mathbf {X}_i^T\tilde{\varvec{\theta }} \right) \mathrm{d}s\nonumber \\&=\frac{1}{2} f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u) \eta ^T \left( \frac{1}{nh}\sum \limits _{i=1}^nM_i{M_i}^TK_i \right) \eta \nonumber \\&\quad +\left( \frac{f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)}{\sqrt{nh}}\sum \limits _{i=1}^n K_ir_i(u)M_i \right) ^T\eta +o_p(1)\nonumber \\&\equiv B_{n1}(\eta )+B_{n2}(\eta )+o_p(1), \end{aligned}$$

where \( B_{n1}(\eta ) =\frac{1}{2} f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u) \eta ^T \left( \frac{1}{nh}\sum \limits _{i=1}^nM_i{M_i}^TK_i \right) \eta \),

$$\begin{aligned} B_{n2}(\eta )=\left( \frac{f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)}{\sqrt{nh}}\sum \limits _{i=1}^n K_ir_i(u)M_i \right) ^T\eta . \end{aligned}$$

We next calculate \(\mathrm {Var}(B_n(\eta )| \tilde{\mathcal {X}})\). Denote

$$\begin{aligned} \Delta _i=M_i^T\eta /\sqrt{nh}= \left[ \mathbf {a}-\mathbf {g}(u)+(\mathbf {b}-\mathbf {g}'(u))(\mathbf {X}_i^T\tilde{\varvec{\theta }}-u) \right] ^T\mathbf {Z}_i. \end{aligned}$$

Since

$$\begin{aligned} \mathrm {Var} \left[ B_n(\eta )|\tilde{\mathcal {X}} \right]&=\sum \limits _{i=1}^n \mathrm{Var} \left\{ \left( K_i \int _{r_i(u)}^{r_i(u)+\Delta _i} \left[ I \left\{ \varepsilon _i\le s \right\} -I \left\{ \varepsilon _i\le 0 \right\} \right] \mathrm{d}s \right) \big |\tilde{\mathcal {X}} \right\} \\&=\sum \limits _{i=1}^n \mathrm{Var} \left\{ \left( K_i \int _0^{\Delta _i} \left[ I \left\{ \varepsilon _i \le r_i(u)+t \right\} -I \left\{ \varepsilon _i\le r_i(u) \right\} \right] \mathrm{d}t \right) \big |\tilde{\mathcal {X}} \right\} \\&\le \sum \limits _{i=1}^n \mathrm {E} \left[ \left( K_i \int _0^{\Delta _i} \left[ I \left\{ \varepsilon _i\le r_i(u)+t \right\} -I \left\{ \varepsilon _i\le r_i(u) \right\} \right] \mathrm{d}t \right) ^2 \big | \tilde{\mathcal {X}} \right] \\&\le \sum \limits _{i=1}^n K_i^2 \int _0^{|\Delta _i|}\int _0^{|\Delta _i|} \left[ F(r_i(u)+|\Delta _i|) -F(r_i(u)) \right] \mathrm{d}v_1\mathrm{d}v_2\\&=o \left( \sum \limits _{i=1}^n K_i^2 \Delta _i^2 \right) =o_p(1). \end{aligned}$$

Therefore, we have \(\mathrm {Var}(B_n(\eta )| \tilde{\mathcal {X}})=o(1)\), and it follows that

$$\begin{aligned} B_n(\eta )= B_{n1}(\eta )+B_{n2}(\eta )+o_p(1). \end{aligned}$$
(18)

Denote \( \mathbb {S}_n=\frac{1}{nh}f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)\sum \limits _{i=1}^n M_i{M_i}^T K_i~\). By Lemma 3, it is easy to prove that \(\mathbb {S}_n=\mathbb {S}+O_p(\tau _n+\delta _{\varvec{\theta }})\), where

$$\begin{aligned} \mathbb {S}=f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)f_{\mathcal {U}}(u)\mathrm {E}(\mathbf {Z}\mathbf {Z}^T|\mathbf {X}^T\varvec{\theta }=u)\otimes \mathrm {diag}(1,\mu _2), \end{aligned}$$

and \(A\otimes B\) denotes the Kronecker product of two matrices.

Combining the above results, we have

$$\begin{aligned} B_{n1}(\eta )=\frac{1}{2}\eta ^T\mathbb {S}\eta +o_p(1). \end{aligned}$$
(19)

We now consider \(B_{n2}(\eta )\). Note that

$$\begin{aligned} r_i(u)=\left( \mathbf {X}_i^T\varvec{\theta }_d\mathbf {g}'\left( \mathbf {X}_i^T\tilde{\varvec{\theta }}\right) -\frac{1}{2}\mathbf {g}''(u)\left( \mathbf {X}_{i0}^T\tilde{\varvec{\theta }}\right) ^2 +O\left( \varvec{\theta }_d^2+ \left( \mathbf {X}_{i0}^T\tilde{\varvec{\theta }}\right) ^3\right) \right) ^T\mathbf {Z}_i, \end{aligned}$$

hence it follows that

$$\begin{aligned} \frac{1}{\sqrt{nh}}\sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)\mathbf {Z}_iK_ir_i(u)&= \sqrt{nh}\mathrm {E}(\mathbf {Z}\mathbf {Z}^T|\mathbf {X}^T\varvec{\theta }) f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)f_{\mathcal {U}}(u)\nonumber \\&\quad \times \Bigg ( \mathbf {g}'(u)\mu _{\varvec{\theta }}(\mathbf {x})^T\varvec{\theta }_d -\frac{1}{2}\mathbf {g}''(u)\mu _2h^2\nonumber \\&\quad +O(h^4+\delta ^2_{\varvec{\theta }}+h^2\delta _{\varvec{\theta }})\Bigg ), \nonumber \\ \frac{1}{\sqrt{nh}}\sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)K_i \frac{\mathbf {X}_{i0}^T\tilde{\varvec{\theta }}}{h} \mathbf {Z}_ir_i(u)&=\sqrt{nh}[O(h^3+h\delta _{\varvec{\theta }})]. \end{aligned}$$
(20)

Combining the results from (17), (18), (19) and (20), we have

$$\begin{aligned} Q_n(\eta )&=\frac{1}{2} \eta ^T \mathbb {S} \eta -W_n^T\eta +\sqrt{nh}f_Y(q_{\tau }(\mathbf {x},\mathbf {z})|u)f_{\mathcal {U}}(u)\nonumber \\&\quad \times \left( {\begin{array}{c} \mathrm {E}(\mathbf {Z}\mathbf {Z}^T|\mathbf {X}^T\varvec{\theta }) \left[ \mathbf {g}'(u)\mu _{\varvec{\theta }}(\mathbf {x})^T\varvec{\theta }_d -\frac{1}{2} \mathbf {g}''(u)\mu _2h^2 +O(h^4+\delta ^2_{\varvec{\theta }}+h^2\delta _{\varvec{\theta }}) \right] \\ O(h^3+h\delta _{\varvec{\theta }}) \end{array}} \right) ^T \eta +o_p(1). \end{aligned}$$

By Lemma 1, the minimizer of \(Q_n(\eta )\) can be expressed as

$$\begin{aligned} \hat{\eta }_n=\mathbb {S}^{-1}W_n-\sqrt{nh}\left( {\begin{array}{c} \mathbf {g}'(u)\mu _{\varvec{\theta }}(\mathbf {x})^T\varvec{\theta }_d - \frac{1}{2}\mathbf {g}''(u)\mu _2h^2 +O(\delta ^2_{\varvec{\theta }}+(h^2+\delta _{\varvec{\theta }})\tau _n) \\ O(h^3+h\delta _{\varvec{\theta }}) \end{array}}\right) +o_p(1). \end{aligned}$$

According to the definitions of \(\hat{\eta }_n\) and \(W_n\), the first part follows. Meanwhile, by Lemma 2, the second part also follows. \(\square \)
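
Because \(\rho _{\tau }\) is piecewise linear and positively homogeneous, the weighted minimization defining \((\hat{\mathbf {g}},\hat{\mathbf {g}}')\) in Lemma 4 can be solved exactly as a linear program. The sketch below does this for the simplest special case \(\mathbf {Z}_i\equiv 1\) (a single unknown function g), with an Epanechnikov kernel; all names and tuning values are our own illustrative choices, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import linprog

def local_linear_qr(y, u, u0, tau, h):
    """Local-linear tau-quantile fit at u0 for the scalar case Z_i = 1:
    minimize sum_i K((u_i - u0)/h) * rho_tau(y_i - a - b*(u_i - u0)),
    cast as an LP with the residual split into positive/negative parts."""
    n = len(y)
    k = np.maximum(0.75 * (1 - ((u - u0) / h) ** 2), 0)  # Epanechnikov weights
    X = np.column_stack([np.ones(n), u - u0])
    # variables: [a, b, r+_1..r+_n, r-_1..r-_n], residual = r+ - r-
    c = np.concatenate([np.zeros(2), tau * k, (1 - tau) * k])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * 2 + [(0, None)] * (2 * n)
    sol = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return sol.x[0], sol.x[1]          # (g_hat(u0), g_hat'(u0))

rng = np.random.default_rng(0)
u = rng.uniform(-1, 1, 400)
y = np.sin(np.pi * u) + 0.2 * rng.standard_normal(400)
print(local_linear_qr(y, u, 0.0, tau=0.5, h=0.3))   # close to (0, pi)
```

For vector-valued \(\mathbf {Z}_i\), the design columns \((\mathbf {Z}_i^T,\mathbf {Z}_i^T(u_i-u_0))\) replace \((1,u_i-u_0)\) and the same linear program applies.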

Proof of Theorem 1

Given the estimates \(\hat{\mathbf {g}}(\mathbf {X}_j^T\tilde{\varvec{\theta }})\) and \(\hat{\mathbf {g}}'(\mathbf {X}_j^T\tilde{\varvec{\theta }})\) of \(\mathbf {g}(\mathbf {X}_j^T\tilde{\varvec{\theta }})\) and \(\mathbf {g}'(\mathbf {X}_j^T\tilde{\varvec{\theta }})\), \(j=1,\ldots ,n\), by (6), the estimate of \(\varvec{\theta }\) can be obtained as

$$\begin{aligned} \hat{\varvec{\theta }}=\mathop {\mathrm{argmin}}\limits _{\Vert \varvec{\theta }\Vert =1,\varvec{\theta }_1>0} \sum \limits _{j=1}^n \sum \limits _{i=1}^n\rho _{\tau } \left( Y_i-[\hat{\mathbf {g}}(\mathbf {X}_j^T\tilde{\varvec{\theta }})+\hat{\mathbf {g}}'(\mathbf {X}_j^T \tilde{\varvec{\theta }})\mathbf {X}_{ij}^T\varvec{\theta }]^T\mathbf {Z}_i \right) \omega _{ij}. \end{aligned}$$

Denote \(\tilde{U}_i=\mathbf {X}_i^T\tilde{\varvec{\theta }},~\tilde{U}_j=\mathbf {X}_j^T\tilde{\varvec{\theta }}\). Let

$$\begin{aligned}&\hat{\varvec{\theta }}^{*}=\sqrt{n}\left( \hat{\varvec{\theta }}-\varvec{\theta } \right) ,~ M_{ij}= \mathbf {Z}_i^T\hat{\mathbf {g}}'(\tilde{U}_j)\mathbf {X}_{ij},\\&r_{ij}=\left( -\mathbf {g}(\mathbf {X}_i^T\varvec{\theta })+\hat{\mathbf {g}}(\tilde{U}_j)+ \hat{\mathbf {g}}'(\tilde{U}_j)\mathbf {X}_{ij}^T\varvec{\theta }\right) ^T\mathbf {Z}_i, \end{aligned}$$

then \(\hat{\varvec{\theta }}^{*}\) is the minimizer of

$$\begin{aligned} \mathcal {Q}_n(\varvec{\theta }^{*})= \sum \limits _{j=1}^n \sum \limits _{i=1}^n\omega _{ij} \left[ \rho _{\tau }\left( \varepsilon _i- r_{ij}-M_{ij}^T\varvec{\theta }^{*}/\sqrt{n} \right) -\rho _{\tau }(\varepsilon _i-r_{ij}) \right] . \end{aligned}$$

By Knight's (1998) identity (16), we can rewrite \(\mathcal {Q}_n(\varvec{\theta }^{*})\) as

$$\begin{aligned} \mathcal {Q}_n(\varvec{\theta }^{*})&=-\frac{1}{\sqrt{n}}\sum \limits _{j=1}^n\sum \limits _{i=1}^n\omega _{ij} \psi _{\tau }(\varepsilon _i)M_{ij}^T\varvec{\theta }^{*} \nonumber \\&\quad + \sum \limits _{j=1}^n \sum \limits _{i=1}^n\omega _{ij}\int _{r_{ij}}^{r_{ij}+M_{ij}^T\varvec{\theta }^{*}/\sqrt{n}} [ I(\varepsilon _i\le s)-I(\varepsilon _i\le 0)]\mathrm{d}s\nonumber \\&\equiv \mathcal {Q}_{1n}(\varvec{\theta }^{*})+\mathcal {Q}_{2n}(\varvec{\theta }^{*}), \end{aligned}$$

where \( \mathcal {Q}_{1n}(\varvec{\theta }^{*})= -\frac{1}{\sqrt{n}}\sum \limits _{j=1}^n\sum \limits _{i=1}^n\omega _{ij} \psi _{\tau }(\varepsilon _i)M_{ij}^T\varvec{\theta }^{*} \),

$$\begin{aligned} \mathcal {Q}_{2n}(\varvec{\theta }^{*})=&\sum \limits _{j=1}^n \sum \limits _{i=1}^n \omega _{ij}\int _{r_{ij}}^{r_{ij}+M_{ij}^T\varvec{\theta }^{*}/\sqrt{n}} ( I(\varepsilon _i\le s)-I(\varepsilon _i\le 0))\mathrm{d}s. \end{aligned}$$

First, we consider the conditional expectation of \(\mathcal {Q}_{2n}(\varvec{\theta }^{*})\) given \( \tilde{\mathcal {X}}\). By direct calculation, we have

$$\begin{aligned} \mathrm {E} \left( \mathcal {Q}_{2n}(\varvec{\theta }^{*})\bigl | \tilde{\mathcal {X}} \right)&=\sum \limits _{j=1}^n \sum \limits _{i=1}^n\int _{r_{ij}}^{r_{ij}+M_{ij}^T\varvec{\theta }^{*}/\sqrt{n}} \omega _{ij}\left[ s f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)(1+o(1)) \right] \mathrm{d}s\\&=\frac{1}{2} \varvec{\theta }^{*T} \left( \frac{1}{n} \sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i) M_{ij}M_{ij}^T\omega _{ij} \right) \varvec{\theta }^{*}\\&\quad +\left( \frac{1}{\sqrt{n}} \sum \limits _{j=1}^n \sum \limits _{i=1}^n\omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)r_{ij}M_{ij} \right) ^T\varvec{\theta }^{*}+o_p(1)\\&\equiv \mathcal {Q}_{2n1}(\varvec{\theta }^{*})+\mathcal {Q}_{2n2}(\varvec{\theta }^{*})+o_p(1), \end{aligned}$$

where \(\mathcal {Q}_{2n1}(\varvec{\theta }^{*})=\frac{1}{2} \varvec{\theta }^{*T} \left( \frac{1}{n} \sum \limits _{j=1}^n \sum \limits _{i=1}^n\omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i) M_{ij}M_{ij}^T \right) \varvec{\theta }^{*}\),

$$\begin{aligned} \mathcal {Q}_{2n2}(\varvec{\theta }^{*}) =&\left( \frac{1}{\sqrt{n}} \sum \limits _{j=1}^n \sum \limits _{i=1}^n\omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)M_{ij}r_{ij} \right) ^T\varvec{\theta }^{*}+o_p(1). \end{aligned}$$

Denote \(\mathcal {R}_n(\varvec{\theta }^{*})=\mathcal {Q}_{2n}(\varvec{\theta }^{*}) -\mathrm {E}(\mathcal {Q}_{2n}(\varvec{\theta }^{*})|\tilde{\mathcal {X}})\). It is easy to obtain \(\mathcal {R}_n(\varvec{\theta }^{*})=o_p(1)\); then we have \( \mathcal {Q}_{2n}(\varvec{\theta }^{*})= \mathcal {Q}_{2n1}(\varvec{\theta }^{*})+\mathcal {Q}_{2n2}(\varvec{\theta }^{*})+o_p(1)\).

Next, we consider \(\mathcal {Q}_{2n1}(\varvec{\theta }^{*})\) and \(\mathcal {Q}_{2n2}(\varvec{\theta }^{*})\), respectively. For \(\mathcal {Q}_{2n1}(\varvec{\theta }^{*})\), let

$$\begin{aligned} \mathcal {G}_n^{\tilde{\varvec{\theta }}}=\frac{1}{n} \sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)M_{ij}M_{ij}^T\omega _{ij}. \end{aligned}$$

By Lemma 2, it is easy to show that \(\mathcal {G}_n^{\tilde{\varvec{\theta }}}=2\mathcal {G}+O(h^2+\delta _n+\delta _{\varvec{\theta }})\), where \(\mathcal {G}\) is defined in Theorem 1.

Denote \(W_{\varvec{\theta }}(\mathbf {x})=\mathrm {E}(f_Y(q_{\tau }(\mathbf {X},\mathbf {Z})|\mathbf {X}^T\varvec{\theta })\mathbf {ZZ}^T|\mathbf {X}^T\varvec{\theta }=\mathbf {x}^T\varvec{\theta })\), then

$$\begin{aligned} \mathcal {Q}_{2n1}(\varvec{\theta }^{*})= \varvec{\theta }^{*T} \mathcal {G} \varvec{\theta }^{*} +o_p(1). \end{aligned}$$
(21)

For \(\mathcal {Q}_{2n2}(\varvec{\theta }^{*})\), note that

$$\begin{aligned} r_{ij}&= \mathbf {Z}_i^T\left( \mathbf {g}'(\tilde{U}_i)\mathbf {X}_i^T\varvec{\theta }_d-\frac{1}{2}\mathbf {g}''(\tilde{U}_j) \left( \mathbf {X}_{ij}^T\tilde{\varvec{\theta }}\right) ^2-\hat{\mathbf {g}}'(\tilde{U}_j)\mathbf {X}_{ij}^T\varvec{\theta }_d\right) \nonumber \\&\quad +\mathbf {Z}_i^T \left( \hat{\mathbf {g}}(\tilde{U}_j)-\mathbf {g}(\tilde{U}_j)+ (\hat{\mathbf {g}}'(\tilde{U}_j)-\mathbf {g}'(\tilde{U}_j))\mathbf {X}_{ij}^T \tilde{\varvec{\theta }}+O\left( \varvec{\theta }_d^2+\left( \mathbf {X}_{ij}^T\tilde{\varvec{\theta }}\right) ^3\right) \right) . \end{aligned}$$

Hence, we obtain

$$\begin{aligned} \mathcal {Q}_{2n2}(\varvec{\theta }^{*})&= \frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)\omega _{ij}M_{ij}^T\varvec{\theta }^{*} (\mathbf {Z}_i^T, \mathbf {Z}_i^T\mathbf {X}_{ij}^T\tilde{\varvec{\theta }}/h) \left( {\begin{array}{c} \hat{\mathbf {g}}(\tilde{U}_j)-\mathbf {g}(\tilde{U}_j) \\ h(\hat{\mathbf {g}}'(\tilde{U}_j)-\mathbf {g}'(\tilde{U}_j)) \end{array}} \right) \\&\quad +\frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)\omega _{ij}M_{ij}^T\varvec{\theta }^{*} \mathbf {Z}_i^T \\&\quad \times \left( \mathbf {g}'(\tilde{U}_i)\mathbf {X}_i^T\varvec{\theta }_d-\hat{\mathbf {g}}'(\tilde{U}_j)\mathbf {X}_{ij}^T\varvec{\theta }_d-\frac{1}{2} \mathbf {g}''(\tilde{U}_j)(\mathbf {X}_{ij}^T\tilde{\varvec{\theta }})^2\right) \\&\equiv \left( \mathcal {Q}_{2n21}+\mathcal {Q}_{2n22} \right) ^T\varvec{\theta }^{*}+O(\delta _{\varvec{\theta }}^2+h^3), \end{aligned}$$

where

$$\begin{aligned} \mathcal {Q}_{2n21}&= \frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)\omega _{ij}M_{ij} \left( \mathbf {Z}_i^T, \mathbf {Z}_i^T\mathbf {X}_{ij}^T\tilde{\varvec{\theta }}/h\right) \left( {\begin{array}{c} \hat{\mathbf {g}}(\tilde{U}_j)-\mathbf {g}(\tilde{U}_j) \\ h(\hat{\mathbf {g}}'(\tilde{U}_j)-\mathbf {g}'(\tilde{U}_j)) \end{array}}\right) ,\\ \mathcal {Q}_{2n22}&=\frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)\omega _{ij}M_{ij} \mathbf {Z}_i^T \\&\quad \times \left( \mathbf {g}'(\tilde{U}_i)\mathbf {X}_i^T\varvec{\theta }_d-\hat{\mathbf {g}}'(\tilde{U}_j)\mathbf {X}_{ij}^T\varvec{\theta }_d-\frac{1}{2} \mathbf {g}''(\tilde{U}_j)(\mathbf {X}_{ij}^T\tilde{\varvec{\theta }})^2\right) . \end{aligned}$$

We now consider \(\mathcal {Q}_{2n21}\) and \(\mathcal {Q}_{2n22}\). By the asymptotic expressions of \(\hat{\mathbf {g}}(\mathbf {x}^T\tilde{\varvec{\theta }})\) and \(\hat{\mathbf {g}}'(\mathbf {x}^T\tilde{\varvec{\theta }})\) obtained in Lemma 4, we have

$$\begin{aligned} \mathcal {Q}_{2n21}&=\frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n \omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)M_{ij} \left( \mathbf {Z}_i^T,\mathbf {X}_{ij}^T\tilde{\varvec{\theta }}\mathbf {Z}_i^T\right) \left( {\begin{array}{c} R_{n1}^{\tilde{\varvec{\theta }}}(\mathbf {X}_j) \\ R_{n2}^{\tilde{\varvec{\theta }}}(\mathbf {X}_j) \end{array}}\right) \\&\quad + \frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n \omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)M_{ij} \mathbf {Z}_i^T \\&\quad \times \left( \frac{1}{2}\mathbf {g}''(\tilde{U}_j)\mu _2h^2-\mathbf {g}'(\tilde{U}_j)\mu _{\varvec{\theta }}(\mathbf {X}_j)^T\varvec{\theta }_d \right) \\&\quad +O_p((h^2+\delta _{\varvec{\theta }})\tau _n+\delta _{\varvec{\theta }}^2+h^3+h\delta _{\varvec{\theta }})\\&\equiv T_1+T_2+o_p(1), \end{aligned}$$

where

$$\begin{aligned} T_1= & {} \frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n \omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)M_{ij} \left( \mathbf {Z}_i^T,\mathbf {X}_{ij}^T\tilde{\varvec{\theta }}\mathbf {Z}_i^T\right) \left( {\begin{array}{c} R_{n1}^{\tilde{\varvec{\theta }}}(\mathbf {X}_j) \\ R_{n2}^{\tilde{\varvec{\theta }}}(\mathbf {X}_j) \end{array}}\right) ,\\ T_2= & {} \frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n \omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)M_{ij} \mathbf {Z}_i^T \\&\quad \times \left( \frac{1}{2}\mathbf {g}''(\tilde{U}_j)\mu _2h^2-\mathbf {g}'(\tilde{U}_j)\mu _{\varvec{\theta }}(\mathbf {X}_j)^T\varvec{\theta }_d \right) . \end{aligned}$$

By direct calculation, it follows that

$$\begin{aligned} T_1&= \frac{1}{\sqrt{n}} \sum \limits _{j=1}^n \sum \limits _{i=1}^n \frac{\omega _{ij} f_Y(q_{\tau }(\mathbf {X}_i,\mathbf {Z}_i)|\tilde{U}_i)}{nf_{\mathcal {U}}(\tilde{U}_j)f_Y(q_{\tau }(\mathbf {X}_j,\mathbf {Z}_j)|\tilde{U}_j)} M_{ij} \left( \mathbf {Z}_i^T,~\mathbf {Z}_i^T \frac{\mathbf {X}_{ij}^T\tilde{\varvec{\theta }}}{h} \right) W_{\tilde{\varvec{\theta }}}(\mathbf {X}_j)^{-1} \\&\quad \times \sum \limits _{k=1}^n \left( \begin{array}{c} \mathbf {Z}_k\\ \mathbf {Z}_k \frac{\mathbf {X}_{kj}^T\tilde{\varvec{\theta }}}{h} \end{array} \right) K_h\left( \mathbf {X}_{kj}^T\tilde{\varvec{\theta }}\right) \psi _{\tau }(\varepsilon _k)\\&=\frac{1}{\sqrt{n}} \sum \limits _{k=1}^n \sum \limits _{j=1}^n \psi _{\tau }(\varepsilon _k) \omega _{kj} [\mu _{\varvec{\theta }}(\mathbf {X}_j)-\mathbf {X}_j]\mathbf {g}'(\tilde{U}_j)^T\mathbf {Z}_k +o_p(1). \end{aligned}$$

Combining \(T_1\) and \(\mathcal {Q}_{1n}(\varvec{\theta }^{*})\), we have

$$\begin{aligned} \mathcal {Q}_{1n}(\varvec{\theta }^{*})+T_1^T\varvec{\theta }^{*}&= \left[ - \frac{1}{\sqrt{n}} \sum \limits _{i=1}^n \sum \limits _{j=1}^n \psi _{\tau }(\varepsilon _i) \omega _{ij} \hat{\mathbf {g}}'(\tilde{U}_j) ^T\mathbf {Z}_i\left( \mathbf {X}_i-\mu _{\varvec{\theta }}(\mathbf {X}_j) \right) \right] ^T\varvec{\theta }^{*}+o_p(1)\nonumber \\&= -\mathcal {W}_n^T\varvec{\theta }^{*}+o_p(1), \end{aligned}$$
(22)

where \(\mathcal {W}_n=\frac{1}{\sqrt{n}} \sum \limits _{i=1}^n \sum \limits _{j=1}^n \psi _{\tau }(\varepsilon _i) \omega _{ij} \hat{\mathbf {g}}'(\tilde{U}_j) ^T\mathbf {Z}_i\left[ \mathbf {X}_i-\mu _{\varvec{\theta }}(\mathbf {X}_j) \right] \). By Lemma 2, we obtain

$$\begin{aligned} \mathcal {W}_n=\frac{1}{\sqrt{n}} \sum \limits _{i=1}^n \psi _{\tau }(\varepsilon _i) \mathbf {g}'(\tilde{U}_i)^T \mathbf {Z}_i(\mathbf {X}_i-\mu _{\varvec{\theta }}(\mathbf {X}_i)). \end{aligned}$$
(23)

According to the Cramér–Wold device and the central limit theorem, we have

$$\begin{aligned} \mathcal {W}_n\mathop {\longrightarrow }\limits ^{\mathcal {L}} N(0,\tau (1-\tau )\mathcal {G}_0), \end{aligned}$$
(24)

where \(\mathcal {G}_0\) is defined in Theorem 1.

Combining \(T_2\) and \(\mathcal {Q}_{2n22}\), we obtain

$$\begin{aligned} \mathcal {Q}_{2n22}+T_2&=\frac{1}{\sqrt{n}}\sum \limits _{j=1}^n \sum \limits _{i=1}^n f_Y(q_{\tau }(\mathbf {X}_j,\mathbf {Z}_j)|\tilde{U}_j)\omega _{ij}M_{ij}\mathbf {Z}_i^T \biggl [ \mathbf {g}'(\tilde{U}_i)\mathbf {X}_i^T\varvec{\theta }_d-\hat{\mathbf {g}}'(\tilde{U}_j)\mathbf {X}_{ij}^T\varvec{\theta }_d\\&\quad -\frac{1}{2} \mathbf {g}''(\tilde{U}_j)\left( \mathbf {X}_{ij}^T\tilde{\varvec{\theta }}\right) ^2+\frac{1}{2}\mathbf {g}''(\tilde{U}_j)\mu _2h^2-\mathbf {g}'(\tilde{U}_j)\mu _{\varvec{\theta }}(\mathbf {X}_j)^T\varvec{\theta }_d \biggr ] +o_p(1)\\&=\frac{1}{\sqrt{n}} \sum \limits _{j=1}^n \mathbf {g}'(\tilde{U}_j)^T W_{\tilde{\varvec{\theta }}}(\mathbf {X}_j)\mathbf {g}'(\tilde{U}_j) \left( \mu _{\varvec{\theta }}(\mathbf {X}_j)- \mathbf {X}_j\right) \left( \mu _{\varvec{\theta }}(\mathbf {X}_j)- \mathbf {X}_j\right) ^T\varvec{\theta }_d\\&\quad +o_p(1). \end{aligned}$$

By Lemmas  2 and 3, it is easy to obtain

$$\begin{aligned} \mathcal {Q}_{2n22}+T_2=-\sqrt{n}\mathcal {G} \varvec{\theta }_d +o_p(1). \end{aligned}$$
(25)

Therefore, by (21), (22) and (25), we have

$$\begin{aligned} \mathcal {Q}_n(\varvec{\theta }^{*})= \varvec{\theta }^{*T} \mathcal {G} \varvec{\theta }^{*} - \left[ \mathcal {W}_n+\sqrt{n} \mathcal {G} \varvec{\theta }_d \right] ^T\varvec{\theta }^{*}+o_p(1). \end{aligned}$$

By Lemma 1, the minimizer \(\hat{\varvec{\theta }}^{*}\) of \(\mathcal {Q}_n(\varvec{\theta }^{*})\) can be written as \(\hat{\varvec{\theta }}^{*}=\frac{1}{2}\mathcal {G}^{-1}\mathcal {W}_n+\frac{1}{2}\sqrt{n} \varvec{\theta }_d +o_p(1).\) Since \(\hat{\varvec{\theta }}^{*}=\sqrt{n} \left( \hat{\varvec{\theta }}-\varvec{\theta } \right) \), we have

$$\begin{aligned} \left( \hat{\varvec{\theta }}-\varvec{\theta } \right) =\frac{1}{2} \mathcal {G}^{-1} \frac{1}{\sqrt{n}}\mathcal {W}_n+\frac{1}{2} \left( \tilde{\varvec{\theta }}-\varvec{\theta } \right) +o_p(1/\sqrt{n}). \end{aligned}$$
(26)

The convergence of the estimation algorithm follows from the above equation.

Define \(\tilde{\varvec{\theta }}_k\) as the kth iterative estimate. For every k, Eq. (26) still holds if we replace \(\tilde{\varvec{\theta }}\) and \(\hat{\varvec{\theta }}\) by \(\tilde{\varvec{\theta }}_k\) and \(\tilde{\varvec{\theta }}_{k+1}\), respectively. Therefore, for sufficiently large k, we have \( \hat{\varvec{\theta }}-\varvec{\theta } =\frac{1}{2}\mathcal {G}^{-1} \frac{1}{\sqrt{n}} \mathcal {W}_n+\frac{1}{2}\left( \hat{\varvec{\theta }}-\varvec{\theta } \right) +o(1/\sqrt{n})\). Then

$$\begin{aligned} \hat{\varvec{\theta }}-\varvec{\theta } = \mathcal {G}^{-1} \frac{1}{\sqrt{n}}\mathcal {W}_n+o(1/\sqrt{n}). \end{aligned}$$

Combining this with (24), we complete the proof of Theorem 1. \(\square \)
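
Equation (26) is a contraction: each iteration keeps the \(\mathcal {G}^{-1}\mathcal {W}_n/\sqrt{n}\) term and halves the error inherited from the previous step, so the influence of the initial estimator vanishes geometrically. A short numeric sketch of this fixed-point argument (the constants are arbitrary):

```python
# e_{k+1} = c + e_k / 2 converges to 2c regardless of e_0; with
# c = (1/2) G^{-1} W_n / sqrt(n), the limit is G^{-1} W_n / sqrt(n),
# matching the displayed equation above.  Illustrative constants only.
c, e = 0.3, 5.0
for _ in range(40):
    e = c + e / 2
print(e, 2 * c)   # both 0.6: the initial error is forgotten
```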

Lemma 5

Suppose u is an interior point of the compact support of \(f_{\mathcal {U}}(\cdot )\), and Conditions A.1–A.7 hold. Then we have

$$\begin{aligned} \sqrt{nh} \left\{ \hat{\mathbf {g}}(u;h,\hat{\varvec{\theta }})-\mathbf {g}(u)-\frac{1}{2}\mathbf {g}''(u)\mu _2h^2\right\} \mathop {\longrightarrow }\limits ^{\mathcal {L}}N(0,\Gamma _{\tau }(u)), \end{aligned}$$
(27)

where \(\Gamma _{\tau }(\cdot )\) is defined in Theorem 2.

Proof of Lemma 5

When the parameter \(\varvec{\theta }\) is known, for a given interior point \(u=\mathbf {x}^T\varvec{\theta }\) of \(\mathcal {U}\), write \(R_{n1}^{\varvec{\theta }}(\mathbf {x})\) as \(R_{n1}^{\varvec{\theta }}\). By an argument similar to the proof of Lemma 4, the estimate of \(\mathbf {g}(u)\) can be written as

$$\begin{aligned} \hat{\mathbf {g}}(u;h,\varvec{\theta }) =\mathbf {g}(u)+\frac{1}{2}\mathbf {g}''(u)\mu _2h^2+R_{n1}^{\varvec{\theta }}+O(h^3). \end{aligned}$$

By the central limit theorem, it is easy to prove

$$\begin{aligned} \sqrt{nh}\left( \hat{\mathbf {g}}(u;h,\varvec{\theta })-\mathbf {g}(u)- \frac{1}{2}\mathbf {g}''(u)\mu _2h^2\right) \mathop {\longrightarrow }\limits ^{\mathcal {L}} N(0,\Gamma _{\tau }(u)). \end{aligned}$$

By Lemma 4, we consider the difference between the two estimates:

$$\begin{aligned} \hat{\mathbf {g}}(u;h,\tilde{\varvec{\theta }})-\hat{\mathbf {g}}(u;h,\varvec{\theta })&=-\mathbf {g}'(u)\mathrm {E}(\mathbf {X}|\mathbf {X}^T\varvec{\theta }=u)^T\varvec{\theta }_d\\&\quad +R_{n1}^{\tilde{\varvec{\theta }}}-R_{n1}^{\varvec{\theta }}+O(\delta _{\varvec{\theta }}+h\delta _n+h^3). \end{aligned}$$

Since \(\varvec{\theta }_d=O_p(1/\sqrt{n})\), we only need to prove

$$\begin{aligned} \sqrt{nh} \left( R_{n1}^{\tilde{\varvec{\theta }}}-R_{n1}^{\varvec{\theta }} \right) =o_p(1). \end{aligned}$$
(28)

When the bandwidth h satisfies \(nh^4 \rightarrow \infty \), since \(\varvec{\theta }_d=O_p(1/\sqrt{n})\), by direct calculation we have

$$\begin{aligned} \mathrm {Var} \left[ \sqrt{nh}\left( R_{n1}^{\tilde{\varvec{\theta }}} -R_{n1}^{\varvec{\theta }}\right) \right]&\le (\tau -\tau ^2)\mathrm {E} \left[ K_h(\mathbf {X}^T\varvec{\theta }-u)-K_h(\mathbf {X}^T\tilde{\varvec{\theta }}-u) \right] ^2\\&=(\tau -\tau ^2) \int \left( K(t)-K(t+\mathbf {X}^T\varvec{\theta }_d/h) \right) ^2 f(u+ht)\mathrm{d}t\\&\le \int \frac{1}{4} K'(t^{*})^2(\mathbf {X}^T\varvec{\theta }_d/h)^2f(u+ht)\mathrm{d}t=O\left( \frac{1}{nh^2}\right) =o(1). \end{aligned}$$

Therefore, (28) holds and the proof of Lemma 5 is complete. \(\square \)

Proof of Theorem 2

Given an interior point \(\mathbf {x}\) of \(\Xi \), we have

$$\begin{aligned}&(nh)^{1/2}[\hat{\mathbf {g}}(\mathbf {x}^T\hat{\varvec{\theta }};h,\hat{\varvec{\theta }})-\mathbf {g}(\mathbf {x}^T\varvec{\theta })]\\&\quad =(nh)^{1/2}[\hat{\mathbf {g}}(\mathbf {x}^T\hat{\varvec{\theta }};h,\hat{\varvec{\theta }})- \hat{\mathbf {g}}(\mathbf {x}^T\varvec{\theta };h,\hat{\varvec{\theta }})+\hat{\mathbf {g}} (\mathbf {x}^T\varvec{\theta };h,\hat{\varvec{\theta }})-\mathbf {g}(\mathbf {x}^T\varvec{\theta })]\\&\quad =E+ (nh)^{1/2} [\hat{\mathbf {g}}(\mathbf {x}^T\varvec{\theta };h,\hat{\varvec{\theta }})-\mathbf {g}(\mathbf {x}^T\varvec{\theta })]. \end{aligned}$$

By Taylor expansion,

$$\begin{aligned} E=\sqrt{nh}[\hat{\mathbf {g}}(\mathbf {x}^T\hat{\varvec{\theta }};h,\hat{\varvec{\theta }}) -\hat{\mathbf {g}}(\mathbf {x}^T\varvec{\theta };h,\hat{\varvec{\theta }})]=\sqrt{nh}\hat{\mathbf {g}}'(\mathbf {x}^T\varvec{\theta }) O_p(\Vert \hat{\varvec{\theta }}-\varvec{\theta }\Vert )=o_p(1). \end{aligned}$$

By the result of Lemma 5, we can conclude that Theorem 2 holds. \(\square \)

Proof of Theorem 3

For convenience, redefine \(\mathbf {u}=\sqrt{n}(\hat{\varvec{\theta }}^{\lambda }-\varvec{\theta })\) and \(\hat{\varvec{\theta }}_d=\hat{\varvec{\theta }}^{QR}-\varvec{\theta }\), where \(\hat{\varvec{\theta }}^{QR}\) is the estimate of \(\varvec{\theta }\) in Theorem 1. Then, \(\mathbf {u}\) is the minimizer of the following objective function:

$$\begin{aligned} G_n(\mathbf {u})&=\sum \limits _{j=1}^n \sum \limits _{i=1}^n \omega _{ij}\left( \rho _{\tau }\left( \varepsilon _i+r_{ij}+M_{ij}^T\mathbf {u}/\sqrt{n}\right) -\rho _{\tau }(\varepsilon _i+r_{ij}) \right) \\&\quad +\sum \limits _{k=1}^p \frac{\lambda }{\sqrt{n}|\hat{{\theta }}_k^{QR}|^2} \sqrt{n} \left[ \left| {\theta }_{k}+\frac{u_k}{\sqrt{n}}\right| -|{\theta }_{ k}| \right] . \end{aligned}$$

Similar to the proof of Theorem 1, we can write \(G_n(\mathbf {u})\) as:

$$\begin{aligned} G_n(\mathbf {u})&=\frac{1}{2}\mathbf {u}^T\mathcal {G} \mathbf {u}-\mathcal {W}_n^T\mathbf {u}+ \sqrt{n} \hat{\varvec{\theta }}_d^T C_0^T\mathbf {u}+o_p(1)\\&\quad + \sum \limits _{k=1}^p \frac{\lambda }{\sqrt{n}|\hat{{\theta }}_k^{QR}|^2} \sqrt{n} \left[ \left| {\theta }_k+\frac{u_k}{\sqrt{n}}\right| -|{\theta }_k| \right] . \end{aligned}$$

For \(1\le k\le p_0\), \({\theta }_k\ne 0\), we have \(|\hat{{\theta }}_k^{QR}|^2\rightarrow _p |{\theta }_k|^2\) and \(\sqrt{n}(|{\theta }_k+u_k/\sqrt{n}|-|{\theta }_k|)\rightarrow u_k \mathrm {sgn}({\theta }_k)\). By Slutsky's theorem, \(\frac{\lambda }{\sqrt{n}|\hat{{\theta }}_k^{QR}|^2} \sqrt{n}(|{\theta }_{ k}+u_k/\sqrt{n}|-|{\theta }_{ k}|)\rightarrow _p 0\).

For \(p_0<k\le p\), \({\theta }_{k}=0\), we have \(\sqrt{n}(|{\theta }_k+u_k/\sqrt{n}|-|{\theta }_k|)=|u_k|\), while \(\hat{{\theta }}_k^{QR}=O_p(1/\sqrt{n})\) implies that the weight \(\lambda /(\sqrt{n}|\hat{{\theta }}_k^{QR}|^2)\) diverges in probability. Therefore, we have

$$\begin{aligned} \frac{\lambda }{\sqrt{n}|\hat{{\theta }}_k^{QR}|^2} \sqrt{n} \left[ \left| {\theta }_{k}+\frac{u_k}{\sqrt{n}} \right| - |{\theta }_{k}| \right] \rightarrow _p W({\theta }_k,u_k)=\left\{ \begin{array}{ll} 0,&{}\quad \mathrm{if} \ {\theta }_k\ne 0,\\ 0,&{}\quad \mathrm{if} \ {\theta }_k =0 \ \mathrm{and} \ u_k=0,\\ \infty ,&{}\quad \mathrm{if} \ {\theta }_k=0 \ \mathrm{and} \ u_k\ne 0. \end{array}\right. \end{aligned}$$

For \(\varvec{\theta }=\left( {\begin{array}{c} \varvec{\theta }^1 \\ \varvec{\theta }^2 \end{array}} \right) \), write \(\mathbf {u}=\left( {\begin{array}{c} \mathbf {u}_1\\ \mathbf {u}_2 \end{array}} \right) \); then we have

$$\begin{aligned} G_n(\mathbf {u})&\rightarrow \frac{1}{2} \mathbf {u}^T \mathcal {G} \mathbf {u} -\mathcal {W}_n^T\mathbf {u} + \sqrt{n}\hat{\varvec{\theta }}_d^T C_0^T\mathbf {u} +\sum \limits _{j=1}^p W({\theta }_j,u_j)+o_p(1)\\&\rightarrow L(\mathbf {u})= \left\{ \begin{array}{ll} \frac{1}{2} \mathbf {u}^T \mathcal {G} \mathbf {u} -\mathcal {W}_n^T\mathbf {u} + \sqrt{n}\hat{\varvec{\theta }}_d^TC_0^T\mathbf {u}, &{} \quad \mathrm{if} \ \mathbf {u}_2=0\\ \infty , &{}\quad \mathrm{otherwise}. \end{array}\right. \end{aligned}$$

Note that \(G_n(\mathbf {u})\) is convex in \(\mathbf {u}\) and \(L(\mathbf {u})\) has a unique minimizer. By the epi-convergence results of Geyer (1994), we can obtain the asymptotic normality by following the proof of Theorem 1.

Next, we consider the consistency of the model selection. Note that the forms of \(G_n(\mathbf {u})\) and \(L(\mathbf {u})\) are similar to those in Zou (2006), and by Condition A.8, \(\mathcal {G}\) is positive definite; hence, we can easily obtain the model selection consistency by following the idea of Zou (2006). \(\square \)
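
The selection mechanism in this proof rests entirely on the behavior of the adaptive weights \(\lambda /(\sqrt{n}|\hat{\theta }_k^{QR}|^2)\). A minimal numeric sketch (with \(\lambda =n^{1/4}\) as an assumed tuning rate and stylized root-n-consistent estimates; all values illustrative):

```python
import numpy as np

# Behavior of the adaptive-LASSO weights lam/(sqrt(n)*|theta_qr|^2):
# lam = n**0.25 is an assumption chosen so that the weight vanishes for
# nonzero components and diverges for truly zero ones.
for n in (100, 10_000, 1_000_000):
    lam = n ** 0.25
    theta_nonzero = 0.8 + 1.0 / np.sqrt(n)   # estimate of theta_k != 0
    theta_zero = 1.0 / np.sqrt(n)            # estimate of theta_k = 0
    w_nonzero = lam / (np.sqrt(n) * theta_nonzero ** 2)
    w_zero = lam / (np.sqrt(n) * theta_zero ** 2)
    print(n, w_nonzero, w_zero)              # first -> 0, second -> infinity
```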

Cite this article

Zhao, W., Zhang, R., Lv, Y. et al. Quantile regression and variable selection of single-index coefficient model. Annals of the Institute of Statistical Mathematics, 69, 761–789 (2017). https://doi.org/10.1007/s10463-016-0558-9