Appendix: Proofs of theorems
Proof of Theorem 2.1
Let \(\sqrt{n}(\hat{\beta }^{WCQR_{\pi }}-\beta _{0})=\mu \) and \(\sqrt{n}(\hat{b}^{WCQR_{\pi }}_{k} -b_{k})=\nu _{k}\), and \(\theta =(\mu ,\nu )\). So, \(\theta \) is the minimizer of the following criterion:
$$\begin{aligned} L_{n}(\pi ,\theta )=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] . \end{aligned}$$
By Knight (1998), for any \(x\ne 0\), we have \(\rho _{\tau }(x-y)-\rho _{\tau }(x)=y[I(x<0)-\tau ]+\int _{0}^y[I(x\le t)-I(x\le 0)]dt.\) Thus, we can rewrite \(L_{n}(\pi ,\theta )\) as
$$\begin{aligned} L_{n}(\pi ,\theta )&=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}} [I(\varepsilon _{i}< b_{k})-\tau _{k}]\\ {}&\quad +\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\\&=\sum _{k=1}^Kz_{n,k}\nu _{k}+W_{n,K}\mu +\sum _{k=1}^KB_{n,k}, \end{aligned}$$
where
$$\begin{aligned} z_{n,k}&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}[I(\varepsilon _{i}< b_{k})-\tau _{k}],\\ W_{n,K}&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}X_{i}^{T}\sum _{k=1}^K[I(\varepsilon _{i}< b_{k})-\tau _{k}],\\ B_{n,k}&=\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt. \end{aligned}$$
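Knight's identity, which drives the decomposition above, can be sanity-checked numerically. The following is a minimal sketch and not part of the original proof: the helper names `rho`, `knight_rhs`, and `max_knight_gap`, the midpoint rule, and the grid size `m` are illustrative assumptions.

```python
import numpy as np


def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * np.where(u < 0, tau - 1.0, tau)


def knight_rhs(x, y, tau, m=200_000):
    """Right-hand side of Knight's identity; the integral uses a midpoint rule."""
    # Signed midpoints of the interval from 0 to y (y may be negative).
    t = y * (np.arange(m) + 0.5) / m
    integrand = (x <= t).astype(float) - float(x <= 0)
    integral = y * integrand.mean()  # signed integral from 0 to y
    return y * (float(x < 0) - tau) + integral


def max_knight_gap(n_draws=200, seed=0):
    """Largest |LHS - RHS| over random draws of (x, y, tau)."""
    rng = np.random.default_rng(seed)
    gap = 0.0
    for _ in range(n_draws):
        x, y = rng.normal(size=2)
        tau = rng.uniform(0.05, 0.95)
        lhs = rho(x - y, tau) - rho(x, tau)
        gap = max(gap, abs(lhs - knight_rhs(x, y, tau)))
    return gap
```

The gap returned by `max_knight_gap()` is limited only by the discretization error of the midpoint rule, which is at most of order \(|y|/m\) because the integrand has a single jump.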
By the Cramér–Wald device and the central limit theorem, \(z_{n,k}\) and \(W_{n,K}\) converge in distribution to \(z_{k}\) and \(W_{1}\), respectively, where \(z_{k}\) is a normal random variable with mean \(0\) and \(W_{1}\) is a \(p\)-dimensional normal random vector with mean \(\mathbf 0 \) and variance–covariance matrix
$$\begin{aligned} \Sigma _{1}=E\left( \frac{1}{\pi (Y)}XX^{T}\left[ \sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})\right] ^2\right) . \end{aligned}$$
(5.1)
where \(\pi (Y)=Pr(V=1|Y,X)=Pr(V=1|Y)\). Therefore
$$\begin{aligned} \sum _{k=1}^Kz_{n,k}\nu _{k}+W_{n,K}\mu \rightarrow _{d}\sum _{k=1}^Kz_{k}\nu _{k}+W_{1}\mu . \end{aligned}$$
Let \( B_{ni,k}=\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\). For any \(\eta >0\), we have
$$\begin{aligned}{}[B_{ni,k}]^{2}=\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right\} +\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}<\eta \right) \right\} \end{aligned}$$
On the one hand, we have
$$\begin{aligned}&nE\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right\} \\&\quad \le nE\left\{ \frac{V_{i}}{\pi _{i}^{2}}\left[ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}2dt\right] ^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right\} \\&\quad \le \frac{4}{M^2}E\left[ |\nu _{k}+X_{i}^{T}\mu |^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right] \rightarrow 0 \quad \text {as} \quad n\rightarrow \infty . \end{aligned}$$
On the other hand, we have
$$\begin{aligned}&nE\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}<\eta \right) \right\} \\&\quad \le nE\left\{ \frac{2V_{i}}{\pi _{i}^2}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}dt\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\right. \\&\quad \quad \left. \times I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad \le \frac{2n\eta }{M^2}E\left\{ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\,I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad =\frac{2n\eta }{M^2}E_{X}\left\{ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[F(b_{k}+t|X)-F(b_{k}|X)]dt\,I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad =\frac{2n\eta }{M^2}E_{X}\left\{ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}tf(b_{k}|X)dt\,I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad \le \frac{D\eta }{M^2}E|\nu _{k}+X_{i}^{T}\mu |^2I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\le \frac{D\eta }{M^2}E|\nu _{k}+X_{i}^{T}\mu |^2\rightarrow 0 \quad \text {as} \quad \eta \rightarrow 0. \end{aligned}$$
Here the last step uses the fact that \(E|\nu _{k}+X_{i}^{T}\mu |^2\) is bounded. Since \(\eta >0\) is arbitrary, as \( n\rightarrow \infty \) it follows that
$$\begin{aligned}&Var(B_{n,k})=\sum _{i=1}^nVar(B_{ni,k})\le nE(B_{ni,k})^2\rightarrow 0, \end{aligned}$$
Furthermore,
$$\begin{aligned} E[B_{n,k}]&=E\left[ \sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\right] \\&=E_{X}\left[ \sum _{i=1}^n\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[F(b_{k}+t|X)-F(b_{k}|X)]dt\right] \\&=E_{X}\left[ \sum _{i=1}^n\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}tf(b_{k}|X)dt\right] +o_{p}(1)\\&=\frac{1}{2}E_{X}(f(b_{k}|X))\nu _{k}^2+\frac{1}{2}\mu ^{T}E_{X}[f(b_{k}|X)XX^{T}]\mu +o_{p}(1). \end{aligned}$$
Therefore, we get
$$\begin{aligned}&B_{n,k}\!=\!E(B_{n,k})+o_{p}(1)=\frac{1}{2}E_{X}(f(b_{k}|X))\nu _{k}^2+\frac{1}{2}\mu ^{T}E_{X}[f(b_{k}|X)XX^{T}]\mu +o_{p}(1). \end{aligned}$$
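The last equality in the expansion of \(E[B_{n,k}]\) can be unpacked as follows. This is a sketch of the omitted step; the centering condition at the end is an assumption inferred from the displayed result, not a condition stated in this appendix. Since \(\int _{0}^{a}t\,dt=a^{2}/2\),
$$\begin{aligned} \sum _{i=1}^n\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}tf(b_{k}|X_{i})dt&=\frac{1}{2n}\sum _{i=1}^nf(b_{k}|X_{i})(\nu _{k}+X_{i}^{T}\mu )^{2},\\ E_{X}\left[ \frac{1}{2}f(b_{k}|X)(\nu _{k}+X^{T}\mu )^{2}\right]&=\frac{1}{2}E_{X}[f(b_{k}|X)]\nu _{k}^{2}+\nu _{k}E_{X}[f(b_{k}|X)X^{T}]\mu +\frac{1}{2}\mu ^{T}E_{X}[f(b_{k}|X)XX^{T}]\mu . \end{aligned}$$
The cross term \(\nu _{k}E_{X}[f(b_{k}|X)X^{T}]\mu \) does not appear in the displayed result, which is consistent with a centering condition such as \(E_{X}[f(b_{k}|X)X]=\mathbf 0 \); this holds, for example, when the errors are independent of the covariates and \(E(X)=\mathbf 0 \).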
Letting \(g_{k}=E_{X}(f(b_{k}|X))\) and \(C=\sum _{k=1}^KE_{X}[f(b_{k}|X)XX^{T}]\), we have
$$\begin{aligned}&L_{n}(\pi ,\theta )=\frac{1}{2}\mu ^{T}C\mu +W_{n,K}\mu +\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k}^2+\sum _{k=1}^Kz_{n,k}\nu _{k}+o_{p}(1), \end{aligned}$$
and
$$\begin{aligned}&L_{n}(\pi ,\theta )\rightarrow _{d}\frac{1}{2}\mu ^{T}C\mu +W_{1}\mu +\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k}^2+\sum _{k=1}^Kz_{k}\nu _{k}. \end{aligned}$$
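For orientation, the minimizer of this limiting objective can be written in closed form. The following is a sketch of the standard step; it treats \(W_{1}\) as a row vector, matching the product \(W_{1}\mu \), and assumes that \(C\) is positive definite and that each \(g_{k}>0\). Setting the gradients with respect to \(\mu \) and \(\nu _{k}\) to zero gives
$$\begin{aligned} C\mu +W_{1}^{T}=\mathbf 0 ,\qquad g_{k}\nu _{k}+z_{k}=0,\qquad \text {so that}\qquad \mu =-C^{-1}W_{1}^{T},\qquad \nu _{k}=-\frac{z_{k}}{g_{k}}. \end{aligned}$$
Because \(W_{1}^{T}\) is normal with mean \(\mathbf 0 \) and covariance \(\Sigma _{1}\), and \(C\) is symmetric, the vector \(-C^{-1}W_{1}^{T}\) is normal with mean \(\mathbf 0 \) and covariance \(C^{-1}\Sigma _{1}C^{-1}\).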
Since \(L_{n}(\pi ,\theta )\) is a convex function, following Knight (1998) and Koenker (2005), we have
$$\begin{aligned}&\mu \rightarrow _{d}N(0,C^{-1}\Sigma _{1}C^{-1}) \end{aligned}$$
where \(\Sigma _{1}\) is defined in (5.1). \(\square \)
Proof of Theorem 2.2
Let \(\sqrt{n}(\hat{\beta }^{AWCQR_{\pi }}-\beta _{0})=\mu ^{*}\), \(\sqrt{n}(\hat{b}^{AWCQR_{\pi }}_{k} -b_{k})=\nu _{k}^{*}\), and \(\theta ^{*}=(\mu ^{*},\nu ^{*})\). \(\theta ^{*}\) is the minimizer of the following criterion:
$$\begin{aligned} \mathbf L _{n}(\pi ,\theta ^{*})&=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k}^{*}+X_{i}^{T}\mu ^{*}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] \\&\ \quad +\sum _{j=1}^p\lambda _{n}\frac{(|\beta _{j}+\frac{\mu _{j}^{*}}{\sqrt{n}}|-|\beta _{j}|)}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}\\&=L_{n}(\pi ,\theta ^{*})+\sum _{j=1}^p\lambda _{n}\frac{(|\beta _{j}+\frac{\mu _{j}^{*}}{\sqrt{n}}|-|\beta _{j}|)}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}. \end{aligned}$$
Similar to the proof of Theorem 4.1 in Zou and Yuan (2008), the second term above can be expressed as
$$\begin{aligned}&\frac{\lambda _{n}}{\sqrt{n}|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}\sqrt{n}\Big [|\beta _{j}+\frac{\mu _{j}^{*}}{\sqrt{n}}|-|\beta _{j}|\Big ]\rightarrow _{p}\ \left\{ \begin{array}{l} 0,\quad if\quad \beta _{j}\ne 0,\\ 0,\quad if \quad \beta _{j}=0 \quad and\quad \mu _{j}^{*}= 0,\\ \infty ,\quad if\quad \beta _{j}=0\quad and\quad \mu _{j}^{*}\ne 0.\\ \end{array} \right. \end{aligned}$$
Let \(\mu ^{*}=(\mu _{1}^{T*},\mu _{2}^{T*})^T\), where \(\mu _{1}^{*}\) contains the first \(q\) elements of \(\mu ^{*}\). Using the same arguments as in Knight (1998) and Koenker (2005), we have \(\mu _{2}^{*}\rightarrow _{p}0\) and \(\mu _{1}^{*}\rightarrow _{d}N(0,[C^{-1}\Sigma _{1}C^{-1}]_{\Lambda \Lambda }).\) Thus, asymptotic normality is proven.
Next, we prove the consistency part. Let \(\hat{\Lambda }_{n}=\{j:\hat{\beta }^{AWCQR_{\pi }}_{j}\ne 0\}\) and \(\Lambda =\{j:\beta _{j}\ne 0\}\). For every \(j\in \Lambda \), the asymptotic normality implies \(P(j\in \hat{\Lambda }_{n})\rightarrow 1.\) It therefore suffices to show that, for every \(j\notin \Lambda \), \(P(j\in \hat{\Lambda }_{n})\rightarrow 0.\) Note that
$$\begin{aligned}&(\hat{b}^{AWCQR_{\pi }}_{1},\ldots ,\hat{b}^{AWCQR_{\pi }}_{K},\hat{\beta }^{AWCQR_{\pi }})\\&\quad =\mathop {\mathrm {argmin}}_{\begin{array}{c} b_1,\ldots ,b_K,\beta \end{array}} \sum _{k=1}^K \sum _{i=1}^n \frac{V_{i}}{\pi _{i}}\rho _{\tau _{k}}(Y_{i}-X_{i}^{T}\beta -b_{k}) +\sum _{j=1}^p\lambda _{n}\frac{\mid \beta _{j}\mid }{\mid \hat{\beta }^{WCQR_{\pi }}_{j}\mid ^2}. \end{aligned}$$
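To make the penalized criterion concrete, the objective inside the argmin can be evaluated as in the following minimal sketch. The function names `check_loss` and `awcqr_objective` and the argument layout are hypothetical, chosen for illustration only; the sketch evaluates the objective at given parameters and does not perform the minimization.

```python
import numpy as np


def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0)), applied elementwise."""
    return u * np.where(u < 0, tau - 1.0, tau)


def awcqr_objective(beta, b, X, Y, V, pi, taus, lam, beta_init):
    """Weighted CQR loss plus adaptive penalty with weights 1 / |beta_init_j|^2.

    beta_init plays the role of the unpenalized WCQR estimate; its entries
    must be nonzero, otherwise the penalty weight is undefined.
    """
    resid = Y - X @ beta          # Y_i - X_i^T beta, shape (n,)
    w = V / pi                    # inverse probability weights V_i / pi_i
    loss = sum(
        np.sum(w * check_loss(resid - b_k, tau_k))
        for b_k, tau_k in zip(b, taus)
    )
    penalty = lam * np.sum(np.abs(beta) / np.abs(beta_init) ** 2)
    return loss + penalty
```

With `lam = 0` the function reduces to the weighted composite quantile loss; observations with `V[i] = 0` contribute nothing, which is how the selection indicator enters.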
If \(j\in \hat{\Lambda }_{n}\), then we must have
$$\begin{aligned}&\sum _{k=1}^K \sum _{i=1}^n \frac{V_{i}}{\pi _{i}}\rho _{\tau _{k}}\left( Y_{i}-X_{ij}\hat{\beta }^{AWCQR_{\pi }}_{j}-\hat{b}^{AWCQR_{\pi }}_{k}\right) +\lambda _{n}\frac{\mid \hat{\beta }^{AWCQR_{\pi }}_{j}\mid }{\mid \hat{\beta }^{WCQR_{\pi }}_{j}\mid ^2}\\&\quad \le \sum _{k=1}^K \sum _{i=1}^n \frac{V_{i}}{\pi _{i}}\rho _{\tau _{k}}\left( Y_{i}-\hat{b}^{AWCQR_{\pi }}_{k}\right) . \end{aligned}$$
Using the fact that \(\left| \frac{\rho _{\tau }(x_{1})-\rho _{\tau }(x_{2})}{x_{1}-x_{2}}\right| \le \max (\tau ,1-\tau )<1\), we have \(\frac{\lambda _{n}}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}<\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}|X_{ij}|=K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}|X_{ij}|.\) So \(P(j\in \hat{\Lambda }_{n})\le P\big (\frac{\lambda _{n}}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}<K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}|X_{ij}|\big )\rightarrow 0.\) This completes the proof of Theorem 2.2. \(\square \)
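The Lipschitz bound on the check function used in the proof of Theorem 2.2 is easy to confirm numerically. The sketch below is illustrative only; the names `check_loss` and `max_lipschitz_ratio` and the choice of standard normal test points are assumptions made for this example.

```python
import numpy as np


def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0)), applied elementwise."""
    return u * np.where(u < 0, tau - 1.0, tau)


def max_lipschitz_ratio(tau, n_pairs=10_000, seed=0):
    """Largest observed |rho(x1) - rho(x2)| / |x1 - x2| over random pairs."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n_pairs)
    x2 = rng.normal(size=n_pairs)
    diff = np.abs(check_loss(x1, tau) - check_loss(x2, tau))
    return (diff / np.abs(x1 - x2)).max()
```

For any `tau` in (0, 1), the returned ratio should not exceed `max(tau, 1 - tau)` beyond floating-point rounding, since the check function is piecewise linear with slopes `tau` and `tau - 1`.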
Proof of Theorem 2.3
Let \(\sqrt{n}(\hat{\beta }^{WCQR_{\hat{\pi }}}-\beta _{0})=\mu ^{**}\), \(\sqrt{n}(\hat{b}^{WCQR_{\hat{\pi }}}_{k} -b_{k})=\nu _{k} ^{**}\), and \(\theta ^{**}=(\mu ^{**},\nu ^{**})\). \(\theta ^{**}\) is the minimizer of the following criterion:
$$\begin{aligned} L_{n}(\widehat{\pi },\theta ^{**})=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\widehat{\pi _{i}}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] , \end{aligned}$$
Let \(Q_{n}(\pi ,\theta ^{**})=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) \right] \). Then
$$\begin{aligned} L_{n}(\widehat{\pi },\theta ^{**})&=[Q_{n}(\pi ,\theta ^{**})-Q_{n}(\pi ,0)]+[Q_{n}(\widehat{\pi },\theta ^{**})-Q_{n}(\pi ,\theta ^{**})]\\&\quad \ -[Q_{n}(\widehat{\pi },0)-Q_{n}(\pi ,0)]\\&=I_{1}+I_{2}-I_{3}, \end{aligned}$$
where
$$\begin{aligned}&I_{1}=L_{n}(\pi ,\theta ^{**}),\\&I_{2}=\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) \left( \frac{1}{\widehat{\pi _{i}}}-\frac{1}{\pi _{i}}\right) \right] ,\\&I_{3}=\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\left( \frac{1}{\widehat{\pi _{i}}}-\frac{1}{\pi _{i}}\right) \right] . \end{aligned}$$
Using the fact that \(\max _{1\le i \le n}|\hat{\pi _{i}}-\pi _{i}|=O_{p}(h^q+\sqrt{\log n/nh})=O_{p}(d_{n})\), we can see that
$$\begin{aligned} I_{2}\!-\!I_{3}&=\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \!\rho _{\tau _{k}}\left( \!\varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\!\right] \left[ \left( \frac{1}{\widehat{\pi _{i}}}-\frac{1}{\pi _{i}}\right) \!\right] \\&=-\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \!\rho _{\tau _{k}}\left( \!\varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) \!-\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\!\right] \left[ \!\frac{(\hat{\pi _{i}}-\pi _{i})}{\pi _{i}^2}\!\right] \\&\quad \ +O_{p}(\sqrt{n}d_{n}^2). \end{aligned}$$
Recalling the definition of \(\widehat{\pi }(\cdot )\), we have
$$\begin{aligned} \widehat{\pi }(y)-\pi (y)&=\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-y)}{\sum _{j=1}^nL_{h}(Y_{j}-y)}+\frac{\sum _{j=1}^n(\pi _{j}-\pi (y))L_{h}(Y_{j}-y)}{\sum _{j=1}^nL_{h}(Y_{j}-y)}\\&=\frac{1}{nhf_{Y}(y)}\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-y)+O_{p}(h^q)+o_{p}(n^{-1/2}). \end{aligned}$$
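The decomposition above corresponds to a kernel-smoothed (Nadaraya–Watson type) estimator of the selection probability, \(\widehat{\pi }(y)=\sum _{j}V_{j}L_{h}(Y_{j}-y)/\sum _{j}L_{h}(Y_{j}-y)\). A minimal sketch follows, under loudly stated assumptions: the Gaussian kernel is a second-order stand-in for the kernel \(L\), whose order \(q\) is not specified here, \(L_{h}(u)\) is taken to be \(L(u/h)\), and the names `gaussian_kernel` and `pi_hat` are hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density; an assumed stand-in for the kernel L."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def pi_hat(y, Y, V, h, kernel=gaussian_kernel):
    """Kernel estimate of pi(y) = Pr(V = 1 | Y = y) at each point in y."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    # weights[a, j] = L((Y_j - y_a) / h); any constant scaling cancels in the ratio.
    weights = kernel((Y[None, :] - y[:, None]) / h)
    return weights @ V / weights.sum(axis=1)
```

In the estimator studied in Theorem 2.3 the function would be evaluated at the observed responses, for example `pi_hat(Y, Y, V, h)`, to obtain the fitted weights \(\widehat{\pi _{i}}\).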
Therefore,
$$\begin{aligned} I_{2}\!-\!I_{3}&= -\sum _{k=1}^K\sum _{i=1}^nV_{i}\Bigg [\rho _{\tau _{k}}\Big (\varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\Big )\nonumber \\&-\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\Bigg ]\Bigg [\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}+O_{p}(h^q)\nonumber \\&+o_{p}\left( n^{-\frac{1}{2}}\right) \Bigg ]+O_{p}(\sqrt{n}d_{n}^2)\nonumber \\&= -\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] \nonumber \\&\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}+o_{p}(1)\nonumber \\&= -\sum _{k=1}^K\sum _{i=1}^nV_{i}\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\nonumber \\&\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}+o_{p}(1)\nonumber \\&= -\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^nV_{i}\frac{X_{i}^{T}\mu ^{**}}{\sqrt{n}}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}\nonumber \\&-\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^nV_{i}\frac{\nu _{k} ^{**}}{\sqrt{n}}[I(\varepsilon _{i}\!<\!b_{k})\!-\!\tau _{k}]\frac{(V_{j}\!-\!\pi _{j})L_{h}(Y_{j}\!-\!Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}\!+\!o_{p}(1).\qquad \quad \ \end{aligned}$$
(5.2)
Furthermore,
$$\begin{aligned}&\sum _{k=1}^K\sum _{j=1}^n\sum _{i=1}^nV_{i}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T}\nonumber \\&\quad \ =\sum _{i=1}^n\sum _{j=1}^n(V_{i}-\pi _{i})\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T}\nonumber \\&\ \qquad +\sum _{i=1}^n\sum _{j=1}^n\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}}X_{i}^{T}. \end{aligned}$$
(5.3)
The first term of (5.3) can be rewritten as
$$\begin{aligned}&\sum _{i=1}^n\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{i}-\pi _{i})^2L(0)}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T} +\sum _{i\ne j}(V_{i}-\pi _{i})\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\\&\ \ \qquad \frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T}\\&\quad =O_{p}\left( \frac{1}{\sqrt{n}h}\right) +O_{p}\left( \frac{1}{\sqrt{nh}}\right) =o_{p}(1). \end{aligned}$$
The second term of (5.3) is
$$\begin{aligned}&\sum _{i=1}^n\sum _{j=1}^n\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}}X_{i}^{T}\\&\quad \ =\frac{1}{\sqrt{n}}\sum _{j=1}^n\frac{(V_{j}-\pi _{j})}{\pi _{j}}E\left[ X_{j}^{T}\sum _{k=1}^K(I(\varepsilon _{j}<b_{k})-\tau _{k})|Y_{j}\right] +o_{p}(1). \end{aligned}$$
In addition, the second term of (5.2) can be rewritten as
$$\begin{aligned}&\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^nV_{i}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}\nu _{k} ^{**}\\&\quad \ =\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^n(V_{i}-\pi _{i})[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}\nu _{k} ^{**}\\&\quad \ \quad +\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^n[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}}\nu _{k} ^{**}\\&\quad \ =\sum _{k=1}^K\frac{1}{\sqrt{n}}\sum _{j=1}^n\frac{(V_{j}-\pi _{j})}{\pi _{j}}E[(I(\varepsilon _{j}<b_{k})-\tau _{k})|Y_{j}]\nu _{k} ^{**}+o_{p}(1). \end{aligned}$$
So, we have
$$\begin{aligned} L_{n}(\widehat{\pi },\theta ^{**})&=\frac{1}{2}\mu ^{**T}\left\{ \sum _{k=1}^KE_{X}[f(b_{k}|X)XX^{T}]\right\} \mu ^{**}+\Psi _{nk}\mu ^{**}+\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k} ^{**2}\\&\quad \ +\sum _{k=1}^K\Phi _{nk}\nu _{k} ^{**}+o_{p}(1), \end{aligned}$$
where
$$\begin{aligned} \Psi _{nk}&=\left\{ \frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}X_{i}^{T}\sum _{k=1}^K[I(\varepsilon _{i}\le b_{k})-\tau _{k}]\right. \\&\qquad \ \left. -\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{(V_{i}-\pi _{i})}{\pi _{i}}E\left[ X_{i}^{T}\sum _{k=1}^K(I(\varepsilon _{i}<b_{k})-\tau _{k})|Y_{i}\right] \right\} ,\\ \Phi _{nk}&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}[I(\varepsilon _{i}\le b_{k})-\tau _{k}]-\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{(V_{i}-\pi _{i})}{\pi _{i}}E[(I(\varepsilon _{i}<b_{k})-\tau _{k})|Y_{i}]. \end{aligned}$$
By the central limit theorem, \(\Phi _{nk}\) converges in distribution to \(z_{k}{^\prime }\), where \(z_{k}{^\prime }\) is a normal random variable with mean 0, and \(\Psi _{nk}\) converges in distribution to \(W_{2}\), where \(W_{2}\) is a normal random vector with mean \(\mathbf 0 \) and covariance matrix \(\Sigma _{2}.\)
Here
$$\begin{aligned} \Sigma _{2}&=\mathrm {cov}\left( \frac{V_{i}}{\pi _{i}}X_{i}^{T}\sum _{k=1}^K[I(\varepsilon _{i}\le b_{k})-\tau _{k}]\right. \nonumber \\&\quad \ \left. -\frac{(V_{i}-\pi _{i})}{\pi _{i}}E\left[ X_{i}^{T}\sum _{k=1}^K(I(\varepsilon _{i}<b_{k})-\tau _{k})|Y_{i}\right] \right) \nonumber \\&=E\left( \frac{1}{\pi (Y)}XX^{T}\left[ \sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})\right] ^2\right) \nonumber \\&\quad \ -E\left( \frac{1-\pi (Y)}{\pi (Y)}E\left[ X\sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})|Y\right] ^{\otimes 2}\right) . \end{aligned}$$
(5.4)
Hence, we have
$$\begin{aligned}&L_{n}(\widehat{\pi },\theta ^{**})\rightarrow _{d}\frac{1}{2}\mu ^{**T}C\mu ^{**}+W_{2}\mu ^{**}+\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k} ^{**2}+\sum _{k=1}^Kz^{\prime }_{k}\nu _{k} ^{**}. \end{aligned}$$
By the same argument as in Knight (1998) and Koenker (2005), we have
$$\begin{aligned} \mu ^{**}\rightarrow _{d}N(0,C^{-1}\Sigma _{2}C^{-1}), \end{aligned}$$
where \(\Sigma _{2}\) is defined in (5.4). This completes the proof of Theorem 2.3. \(\square \)
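The second equality in (5.4) can be verified term by term. This is a sketch; it assumes that the summand inside the covariance has mean zero, as the stated limit requires, and it uses \(\pi (Y)=Pr(V=1|Y,X)\). Write \(A=X\sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})\) and \(T=E\left( \frac{1-\pi (Y)}{\pi (Y)}E[A|Y]^{\otimes 2}\right) \). Since \(V^{2}=V\) and \(E(V|Y,X)=\pi (Y)\),
$$\begin{aligned} E\left( \frac{V}{\pi (Y)^{2}}AA^{T}\right)&=E\left( \frac{1}{\pi (Y)}AA^{T}\right) =\Sigma _{1},\\ E\left( \frac{V(V-\pi (Y))}{\pi (Y)^{2}}AE[A|Y]^{T}\right)&=E\left( \frac{1-\pi (Y)}{\pi (Y)}E[A|Y]^{\otimes 2}\right) =T,\\ E\left( \frac{(V-\pi (Y))^{2}}{\pi (Y)^{2}}E[A|Y]^{\otimes 2}\right)&=E\left( \frac{1-\pi (Y)}{\pi (Y)}E[A|Y]^{\otimes 2}\right) =T, \end{aligned}$$
so that \(\Sigma _{2}=\Sigma _{1}-2T+T=\Sigma _{1}-T\), with \(\Sigma _{1}\) as in (5.1). Because \(T\) is positive semidefinite, this gives \(\Sigma _{2}\preceq \Sigma _{1}\) in the Loewner order, which suggests that using the estimated probabilities \(\widehat{\pi _{i}}\) does not inflate the asymptotic variance relative to using the true \(\pi _{i}\).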
Proof of Theorem 2.4
Since the proof is similar to that of Theorem 2.2, we omit it here. \(\square \)