Jackknife model averaging for mixed-data kernel-weighted spline quantile regressions

Metrika

Abstract

In the past two decades, model averaging has attracted more and more attention and is regarded as a much better tool to solve model uncertainty than model selection. Compared with the conditional mean regression, the quantile regression serves as a robust alternative and shows a lot more information about the conditional distribution of a response variable. In this paper, we propose a jackknife model averaging procedure that chooses the weights by minimizing a leave-one-out cross-validation criterion function for mixed-data kernel-weighted spline quantile regressions that contain both continuous and categorical regressors when all candidate models are potentially misspecified. We demonstrate the JMA estimator is asymptotically optimal in terms of minimizing the out-of-sample final prediction error. Simulation experiments are conducted to assess the relative finite-sample performance of the proposed JMA method with respect to other model selection and averaging methods. Our JMA method is applied to the wage and house datasets.

Acknowledgements

This work was supported by grants from the National Natural Science Foundation of China (Nos. 11731012, 12031005).

Author information

Corresponding author

Correspondence to Xianwen Sun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A The proof of main theorems

In the following proofs, Knight’s (1998) identity will be used repeatedly:

$$\begin{aligned} \rho _\tau (u+v) - \rho _\tau (u) = v\psi _\tau (u) + \int _{0}^{-v}[\varvec{1}(u\le s) - \varvec{1}(u\le 0)]ds, \end{aligned}$$

where \(\psi _\tau (u) = \tau - \varvec{1}(u\le 0)\).
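As a quick sanity check of the identity, the following Python sketch evaluates both sides numerically for given \(u\), \(v\) and \(\tau\). The function names (`rho`, `psi`, `knight_remainder`, `knight_gap`) and the midpoint-rule approximation of the integral are illustrative choices and do not come from the paper.

```python
def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1(u <= 0))."""
    return u * (tau - (1.0 if u <= 0 else 0.0))


def psi(u, tau):
    """psi_tau(u) = tau - 1(u <= 0)."""
    return tau - (1.0 if u <= 0 else 0.0)


def knight_remainder(u, v, steps=100_000):
    """Midpoint-rule value of the signed integral of
    1(u <= s) - 1(u <= 0) for s running from 0 to -v."""
    ds = -v / steps                      # signed step, negative when v > 0
    base = 1.0 if u <= 0 else 0.0
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        total += (1.0 if u <= s else 0.0) - base
    return total * ds


def knight_gap(u, v, tau):
    """Absolute difference between the two sides of Knight's identity."""
    lhs = rho(u + v, tau) - rho(u, tau)
    rhs = v * psi(u, tau) + knight_remainder(u, v)
    return abs(lhs - rhs)
```

Because the integrand is a step function with a single jump, the midpoint rule is off by at most one step width, so the gap should be of order \(|v|\) divided by the number of steps.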

1.1 P1. Proof of Theorem 1

Proof

(i) Let \(\delta _n = \sqrt{K_{(m)}/n}\), and let \(\varvec{v}_{(m)} \in R^{K_{(m)}}\) be such that \(\left\| \varvec{v}_{(m)}\right\| = c\), where \(c > 0\) is a sufficiently large constant. Let

$$\begin{aligned} L_{n(m)}(\varvec{\beta }_{(m)}) = \sum _{i=1}^{n} \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }_{(m)}\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}), \end{aligned}$$

and we just need to show that for any given \(\epsilon > 0\) there exists a large constant c such that, for large enough n, we have

$$\begin{aligned} P\left[ \inf _{\left\| \varvec{v}_{(m)}\right\| = c} L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) > L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right] \ge 1- \epsilon , \end{aligned}$$

which implies that, with probability approaching 1, there exists a local minimum \(\hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)})\) in the ball \(\left\{ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}: \left\| \varvec{v}_{(m)}\right\| \le c\right\} \) such that \(\left\| \hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) -\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) \right\| = O_p(\delta _n)\). By the convexity of \(L_{n(m)}\), it is also the global minimum.
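The passage from a minimum inside the ball to a global minimum can be spelled out as follows. This is the standard convexity argument, sketched here under the assumption that the kernel weights \(L(\cdot )\) are nonnegative, so that \(L_{n(m)}\) is convex. Take any \(\varvec{\beta }_{(m)} = \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + \delta _n\varvec{u}\) with \(\left\| \varvec{u}\right\| > c\), and set \(t = c/\left\| \varvec{u}\right\| \in (0,1)\) and \(\varvec{v}_{(m)} = t\varvec{u}\), so that \(\left\| \varvec{v}_{(m)}\right\| = c\). Then

$$\begin{aligned} L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right)&= L_{n(m)}\left( (1-t)\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + t\varvec{\beta }_{(m)}\right) \\&\le (1-t)L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) + tL_{n(m)}\left( \varvec{\beta }_{(m)}\right) . \end{aligned}$$

On the event that the left side exceeds \(L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \), rearranging gives \(L_{n(m)}(\varvec{\beta }_{(m)}) > L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \), so no point outside the ball can minimize \(L_{n(m)}\).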

Let \(a_{i(m)} = \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\). Then by Knight’s identity, we have

$$\begin{aligned}&L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) - L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \nonumber \\&\quad ={} \sum _{i=1}^{n} \left\{ \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) \right) \right. \nonumber \\&\qquad \left. - \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right\} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\quad ={} \sum _{i=1}^{n} \left\{ \rho _{\tau }\left( \epsilon _i + a_{i(m)} -\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}\right) - \rho _{\tau }\left( \epsilon _i + a_{i(m)}\right) \right\} \nonumber \\ {}&\times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\quad ={} \sum _{i=1}^{n} \left\{ -\delta _n\psi _\tau (\epsilon _i + a_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)} + \int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds \right\} \nonumber \\&\qquad \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\quad ={} -\delta _n\sum _{i=1}^{n} \psi _\tau (\epsilon _i + a_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\qquad + \sum _{i=1}^{n} E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds \right] \nonumber \\&\qquad + \sum _{i=1}^{n} \left\{ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds\right. \nonumber \\&\qquad \left. 
-E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds \right] \right\} \nonumber \\&\quad \triangleq {} A_{n{m},1}(\varvec{v}_{(m)}) + A_{n{m},2}(\varvec{v}_{(m)}) + A_{n{m},3}(\varvec{v}_{(m)}), \end{aligned}$$
(A1)

where \(\alpha _{i(m)}(s) = \varvec{1}(\epsilon _i + a_{i(m)}\le s) - \varvec{1}(\epsilon _i + a_{i(m)}\le 0)\).

By the first order condition for the population minimization problem (10), we have

$$\begin{aligned} E\left[ \psi _\tau (\epsilon _i + a_{i(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)},\varvec{\lambda }_{(m)})\right] = 0. \end{aligned}$$
(A2)

By condition (C7),

$$\begin{aligned} \begin{aligned}&E|A_{n{m},1}(\varvec{v}_{(m)}) |^2\\&\quad \le {} \delta _n^2\sum _{i=1}^{n}\varvec{v}_{(m)}^TE\left[ \psi _\tau ^2(\epsilon _i + a_{i(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)},\varvec{\lambda }_{(m)})\right] \varvec{v}_{(m)} \\&\quad \le {} \bar{c}_{B(m)}n\delta _n^2\left\| \varvec{v}_{(m)} \right\| ^2 \end{aligned} \end{aligned}$$

and thus by Chebyshev’s inequality, we have

$$\begin{aligned} A_{n{m},1}(\varvec{v}_{(m)}) = O_p(\bar{c}_{B(m)}^{1/2}n^{1/2}\delta _n)\left\| \varvec{v}_{(m)} \right\| . \end{aligned}$$
(A3)

For \(A_{n{m},2}(\varvec{v}_{(m)})\), according to the law of iterated expectation, Taylor expansion and condition (C6), we have

$$\begin{aligned}&A_{n{m},2}(\varvec{v}_{(m)}) \nonumber \\&\quad ={}\sum _{i=1}^{n} E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}F(-a_{i(m)}+s |\varvec{X}_i,\varvec{Z}_i)\right. \nonumber \\&\qquad \left. - F(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i) ds \right] \nonumber \\&\quad ={}\sum _{i=1}^{n} E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}} f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)s ds \right] (1+o_p(1)) \nonumber \\&\quad ={}\frac{1}{2}\delta _n^2\varvec{v}_{(m)}^T\left( \sum _{i=1}^{n} E\left[ f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \times \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \right] \right) \varvec{v}_{(m)}(1+o_p(1)) \nonumber \\&\quad ={}\frac{1}{2}\delta _n^2n\varvec{v}_{(m)}^TA_{(m)} \varvec{v}_{(m)}(1+o_p(1)) \ge \frac{\underline{c}_{A(m)}}{2}\delta _n^2n\left\| \varvec{v}_{(m)} \right\| ^2 \end{aligned}$$
(A4)

with probability approaching 1. It can be seen that \(E \left( A_{n{m},3}(\varvec{v}_{(m)})\right) = 0\), and by condition (C7), we have

$$\begin{aligned} \begin{aligned}&Var \left( A_{n{m},3}(\varvec{v}_{(m)})\right) \\&\quad \le {}\sum _{i=1}^{n}E\left\{ L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\left[ \int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds\right] ^2\right\} \\&\quad \le {}\sum _{i=1}^{n}E \left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)} \right] ^2 \\&\quad = {} n\delta _{n}^2\varvec{v}_{(m)}^TE \left[ L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \right] \varvec{v}_{(m)} \le \bar{c}_{B(m)}n\delta _{n}^2\left\| \varvec{v}_{(m)}\right\| ^2. \end{aligned} \end{aligned}$$

Thus we obtain

$$\begin{aligned} A_{n{m},3}(\varvec{v}_{(m)}) = O_p(\bar{c}_{B(m)}^{1/2}n^{1/2}\delta _{n})\left\| \varvec{v}_{(m)}\right\| . \end{aligned}$$
(A5)
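The Taylor step behind (A4) replaces \(\int _0^t [F(-a+s) - F(-a)]ds\) by \(\frac{1}{2}f(-a)t^2\) for small \(t\). The Python sketch below compares the two quantities numerically. It assumes, purely for illustration, that the conditional error distribution is standard normal; the function names are hypothetical and not from the paper.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function (illustrative choice of F)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density (the matching f)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def exact_term(a, t, steps=20_000):
    """Midpoint-rule value of the integral of F(-a + s) - F(-a)
    for s running from 0 to t (t may be negative)."""
    ds = t / steps
    base = norm_cdf(-a)
    total = 0.0
    for k in range(steps):
        total += norm_cdf(-a + (k + 0.5) * ds) - base
    return total * ds


def quadratic_term(a, t):
    """Leading Taylor term f(-a) * t^2 / 2."""
    return 0.5 * norm_pdf(-a) * t * t
```

For small \(|t|\) the ratio `exact_term(a, t) / quadratic_term(a, t)` should be close to 1, with a relative error of order \(|t|\); this is the role played by the \((1+o_p(1))\) factor in (A4).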

By (A3)–(A5), and allowing \(\left\| \varvec{v}_{(m)} \right\| \) to be large enough, both \(A_{n{m},1}(\varvec{v}_{(m)})\) and \(A_{n{m},3}(\varvec{v}_{(m)})\) are dominated by \(A_{n{m},2}(\varvec{v}_{(m)})\) with probability approaching 1. Thus, combining with (A1), we have

$$\begin{aligned} L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) - L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) > 0 \end{aligned}$$

with probability approaching 1. This proves (i).
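The dominance claim in part (i) can be read off by inserting \(\delta _n = \sqrt{K_{(m)}/n}\) into (A3)–(A5). The following bookkeeping is a sketch that treats \(\bar{c}_{B(m)}\) as bounded and \(\underline{c}_{A(m)}\) as bounded away from zero, which is an assumption made here only for illustration. Since \(n\delta _n^2 = K_{(m)}\) and \(n^{1/2}\delta _n = K_{(m)}^{1/2}\), on the sphere \(\left\| \varvec{v}_{(m)}\right\| = c\) we have

$$\begin{aligned} A_{n{m},2}(\varvec{v}_{(m)}) \ge \frac{\underline{c}_{A(m)}}{2}K_{(m)}c^2, \qquad \left| A_{n{m},1}(\varvec{v}_{(m)})\right| + \left| A_{n{m},3}(\varvec{v}_{(m)})\right| = O_p\left( \bar{c}_{B(m)}^{1/2}K_{(m)}^{1/2}\right) c. \end{aligned}$$

The lower bound is quadratic in \(c\) while the stochastic terms are linear in \(c\), so their ratio is of order \(\bar{c}_{B(m)}^{1/2}/(\underline{c}_{A(m)}K_{(m)}^{1/2}c)\) in probability and can be made arbitrarily small by choosing \(c\) large.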

(ii) Let \(\hat{\Delta }_{(m)} = \sqrt{n}\left( \hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) - \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \) and \(\Delta _{(m)} = \sqrt{n}\left( \varvec{\beta }_{(m)} - \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \). It can be seen that

$$\begin{aligned} \hat{\Delta }_{(m)}= & {} \arg \min _{\Delta _{(m)}}\sum _{i=1}^{n}\rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left[ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + n^{-1/2}\Delta _{(m)}\right] \right) \\{} & {} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}). \end{aligned}$$

Let \(V_{(m)}(\Delta ) = n^{-1/2}\sum _{i=1}^{n}\psi _\tau \left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left[ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + n^{-1/2}\Delta \right] \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\) and \(\bar{V}_{(m)}(\Delta ) = E\left( V_{(m)}(\Delta ) \right) \). Define the weighted norm \(\left\| \cdot \right\| _{\varvec{d}_{(m)}}\) by

$$\begin{aligned} \left\| A \right\| _{\varvec{d}_{(m)}} = \left\| \varvec{d}_{(m)}^TA \right\| , \end{aligned}$$

where \(\varvec{d}_{(m)}\) is a \(K_{(m)} \times 1\) vector with \(\left\| \varvec{d}_{(m)} \right\| \le \underline{c}_{B(m)}^{-1/2}\).

We need to show for any large positive constant \(L < \infty \),

$$\begin{aligned}{} & {} \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}(\Delta ) - V_{(m)}(0) - \bar{V}_{(m)}(\Delta )+ \bar{V}_{(m)}(0) \right\| _{\varvec{d}_{(m)}} = o_p(1), \end{aligned}$$
(A6)
$$\begin{aligned}{} & {} \qquad \quad \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| \bar{V}_{(m)}(\Delta )- \bar{V}_{(m)}(0) + A_{(m)}\Delta \right\| _{\varvec{d}_{(m)}} = o_p(1), \end{aligned}$$
(A7)
$$\begin{aligned}{} & {} \left\| V_{(m)}(\hat{\Delta }_{(m)})\right\| _{\varvec{d}_{(m)}} = o_p(1). \end{aligned}$$
(A8)

By (A6)–(A7) and the result of part (i), we have \(\left\| V_{(m)}(\hat{\Delta }_{(m)}){-} V_{(m)}(0) {+} A_{(m)}\hat{\Delta }_{(m)} \right\| _{\varvec{d}_{(m)}} = o_p(1)\). Thus, by conditions (C6)–(C7), we obtain \(\hat{\Delta }_{(m)}= A_{(m)}^{-1}V_{(m)}(0)-A_{(m)}^{-1}V_{(m)}(\hat{\Delta }_{(m)}) + A_{(m)}^{-1}R_{(m)}\), and

$$\begin{aligned} \begin{aligned}&\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\hat{\Delta }_{(m)}=\sqrt{n}\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\left( \hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) - \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \\&\quad ={}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}V_{(m)}(0)-\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}V_{(m)}(\hat{\Delta }_{(m)}) +\varvec{D}^T_{(m)}C_{(m)}^{-1/2} A_{(m)}^{-1}R_{(m)}\\&\quad \triangleq {} \mathbb {P}_{(m)1} -\mathbb {P}_{(m)2} +\mathbb {P}_{(m)3} \end{aligned} \end{aligned}$$

where \(\left\| R_{(m)}\right\| _{\varvec{d}_{(m)}} = o_p(1)\) for any \(\varvec{d}_{(m)}\) with \(\left\| \varvec{d}_{(m)} \right\| \le \underline{c}_{B(m)}^{-1/2}\), and \(C_{(m)}^{1/2}\) represents the symmetric square root of \(C_{(m)}\) and \(C_{(m)}^{-1/2}\) denotes the inverse of \(C_{(m)}^{1/2}\).

Let \(\zeta _{ni} = n^{-1/2}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}\psi _\tau \left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\), then \(\mathbb {P}_{(m)1} = \sum _{i=1}^{n}\zeta _{ni}\). By (A2), we have \(E(\zeta _{ni}) = 0\) and \(E(\mathbb {P}_{(m)1}) = 0\). Thus

$$\begin{aligned}{} & {} Var(\mathbb {P}_{(m)1}) = \sum _{i=1}^{n} Var(\zeta _{ni})\\{} & {} \quad ={}n^{-1}\sum _{i=1}^{n}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} E\left[ \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})\right. \\{} & {} \qquad \left. \times L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right] A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\\{} & {} \quad ={}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}B_{(m)} A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\\{} & {} \quad ={}\varvec{D}^T_{(m)}\varvec{D}_{(m)} = 1. \end{aligned}$$

By the fact that \(tr(AB)\le \lambda _{max}(A)tr(B)\) for symmetric matrix A and positive semi-definite matrix B, where \(tr(\cdot )\) denotes the trace of some matrix, we have

$$\begin{aligned}&E\left\| \zeta _{ni} \right\| ^4\nonumber \\&\quad ={}n^{-2}E\left\{ \left[ tr\left( \varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}\psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})\right. \right. \right. \nonumber \\&\qquad \left. \left. \left. \times L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)} \right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ \left[ tr\left( \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right. \right. \right. \nonumber \\&\qquad \left. \left. \left. \times A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} \right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ tr^2\left( \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right) \right. \nonumber \\&\qquad \left. \times \left[ \lambda _{max}\left( A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}\right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ tr^2\left( \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right) \right. \nonumber \\&\qquad \left. 
\times \left[ \lambda _{max}\left( A_{(m)}^{-1}C_{(m)}^{-1}A_{(m)}^{-1}\right) \right] ^2\left[ \lambda _{max}\left( \varvec{D}_{(m)}\varvec{D}^T_{(m)}\right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ tr^2\left( \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})\right) \left[ \lambda _{max}\left( B_{(m)}^{-1}\right) \right] ^2 \right\} \nonumber \\&\quad ={}O(n^{-2}K_{(m)}^2\underline{c}_{B(m)}^{-2}). \end{aligned}$$
(A9)
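The trace inequality \(tr(AB)\le \lambda _{max}(A)tr(B)\) used in (A9) can be checked numerically on small examples. The Python sketch below does this for random \(2\times 2\) matrices, with \(A\) symmetric and \(B = GG^T\) positive semi-definite. The helper names and the restriction to dimension 2, which lets the largest eigenvalue be written in closed form, are illustrative choices only.

```python
import random


def matmul2(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace2(P):
    """Trace of a 2x2 matrix."""
    return P[0][0] + P[1][1]


def lambda_max_sym2(S):
    """Largest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    mean = 0.5 * (S[0][0] + S[1][1])
    half_gap = 0.5 * (S[0][0] - S[1][1])
    return mean + (half_gap ** 2 + S[0][1] ** 2) ** 0.5


def random_case(rng):
    """Draw a symmetric A and a positive semi-definite B = G G^T."""
    a, b, c = (rng.uniform(-2, 2) for _ in range(3))
    A = [[a, b], [b, c]]
    G = [[rng.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
    Gt = [[G[j][i] for j in range(2)] for i in range(2)]
    return A, matmul2(G, Gt)


def trace_gap(A, B):
    """lambda_max(A) * tr(B) - tr(AB); nonnegative when the inequality holds."""
    return lambda_max_sym2(A) * trace2(B) - trace2(matmul2(A, B))
```

The inequality follows from \(tr(AB) = tr(B^{1/2}AB^{1/2})\) and \(B^{1/2}AB^{1/2} \preceq \lambda _{max}(A)B\), so `trace_gap` should never be negative beyond rounding error.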

Then for any \(\epsilon > 0\), we have

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^{n}E\left[ \left\| \zeta _{ni} \right\| ^2\varvec{1}(\left\| \zeta _{ni} \right\| \ge \epsilon ) \right] \\&\qquad \quad \le {} n\epsilon ^{-2}E\left\| \zeta _{ni} \right\| ^4 = O(n^{-1}K_{(m)}^2\underline{c}_{B(m)}^{-2}) = o(1) \end{aligned} \end{aligned}$$

by condition (C9). Therefore, \(\left\{ \zeta _{ni} \right\} \) satisfies the conditions of the Lindeberg–Feller central limit theorem and we obtain

$$\begin{aligned} \mathbb {P}_{(m)1} {\mathop {\longrightarrow }\limits ^{d}} N(0, 1). \end{aligned}$$
(A10)
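The truncation bound used to verify the Lindeberg condition rests on one elementary observation: on the event \(\left\{ \left\| \zeta _{ni}\right\| \ge \epsilon \right\} \) we have \(1 \le \left\| \zeta _{ni}\right\| ^2/\epsilon ^2\). Written out, and using that the \(\zeta _{ni}\) are identically distributed,

$$\begin{aligned} \sum _{i=1}^{n}E\left[ \left\| \zeta _{ni} \right\| ^2\varvec{1}(\left\| \zeta _{ni} \right\| \ge \epsilon ) \right] \le \sum _{i=1}^{n}E\left[ \left\| \zeta _{ni} \right\| ^2\frac{\left\| \zeta _{ni} \right\| ^2}{\epsilon ^2} \right] = n\epsilon ^{-2}E\left\| \zeta _{ni} \right\| ^4 , \end{aligned}$$

after which (A9) gives the stated order \(O(n^{-1}K_{(m)}^2\underline{c}_{B(m)}^{-2})\) for each fixed \(\epsilon \).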

Let \(\varvec{d}_{(m)} = A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\), and by the facts that \( |\varvec{r}^T(\varvec{A}^T + \varvec{A})\varvec{r} |\le \lambda _{max}(\varvec{A}^T + \varvec{A})\left\| \varvec{r} \right\| ^2\) and \(\lambda _{max}(\varvec{A}^T\varvec{A}) = \lambda _{max}(\varvec{A}\varvec{A}^T)\) for any vector \(\varvec{r}\) and square matrix \(\varvec{A}\), we have

$$\begin{aligned} \begin{aligned} \left\| \varvec{d}_{(m)}\right\| ={}&\left\| \varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} \right\| \\ ={}&\left[ \varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\right] ^{1/2} \\ \le {}&\left[ \lambda _{max}\left( C_{(m)}^{-1/2}A_{(m)}^{-1} A_{(m)}^{-1}C_{(m)}^{-1/2}\right) \left\| \varvec{D}_{(m)}\right\| ^2 \right] ^{1/2} \\ ={}&\lambda _{max}^{1/2} \left( A_{(m)}^{-1}C_{(m)}^{-1}A_{(m)}^{-1}\right) \\ ={}&\lambda _{max}^{1/2} \left( B_{(m)}^{-1}\right) \le \underline{c}_{B(m)}^{-1/2}. \end{aligned} \end{aligned}$$
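The last two equalities in the display above can be unpacked as follows. This sketch assumes the sandwich form \(C_{(m)} = A_{(m)}^{-1}B_{(m)}A_{(m)}^{-1}\), which is what the variance calculation for \(\mathbb {P}_{(m)1}\) suggests, and that \(\lambda _{min}(B_{(m)}) \ge \underline{c}_{B(m)}\). Applying \(\lambda _{max}(\varvec{M}^T\varvec{M}) = \lambda _{max}(\varvec{M}\varvec{M}^T)\) with \(\varvec{M} = A_{(m)}^{-1}C_{(m)}^{-1/2}\) and symmetric \(A_{(m)}\) gives

$$\begin{aligned} \lambda _{max}\left( C_{(m)}^{-1/2}A_{(m)}^{-1} A_{(m)}^{-1}C_{(m)}^{-1/2}\right)&= \lambda _{max}\left( A_{(m)}^{-1}C_{(m)}^{-1}A_{(m)}^{-1}\right) , \\ A_{(m)}^{-1}C_{(m)}^{-1}A_{(m)}^{-1}&= A_{(m)}^{-1}\left( A_{(m)}B_{(m)}^{-1}A_{(m)}\right) A_{(m)}^{-1} = B_{(m)}^{-1}, \\ \lambda _{max}\left( B_{(m)}^{-1}\right)&= \frac{1}{\lambda _{min}\left( B_{(m)}\right) } \le \underline{c}_{B(m)}^{-1}. \end{aligned}$$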

Thus for \(\mathbb {P}_{(m)2} \) and \(\mathbb {P}_{(m)3}\), by (A8)

$$\begin{aligned} \left\| \mathbb {P}_{(m)2}\right\| = \left\| V_{(m)}(\hat{\Delta }_{(m)}) \right\| _{A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}} = o_p(1), \end{aligned}$$
(A11)

and

$$\begin{aligned} \left\| \mathbb {P}_{(m)3}\right\| = \left\| R_{(m)}\right\| _{A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}} = o_p(1). \end{aligned}$$
(A12)

Next, we show that (A6)–(A8) hold under conditions (C1)–(C9). For (A6), write \(b_{i(m)} \equiv \varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) = b_{i(m)}^{+} - b_{i(m)}^{-}\), where \(b_{i(m)}^{+} = \max (b_{i(m)},0)\) and \(b_{i(m)}^{-} = \max (-b_{i(m)},0)\). Thus by Minkowski’s inequality, we obtain

$$\begin{aligned}&\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}(\Delta ) - V_{(m)}(0) - \bar{V}_{(m)}(\Delta )+ \bar{V}_{(m)}(0) \right\| _{\varvec{d}_{(m)}}\nonumber \\&\quad \le {} \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}^+(\Delta ) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta )+ \bar{V}_{(m)}^+(0) \right\| \nonumber \\&\qquad + \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}^-(\Delta ) - V_{(m)}^-(0) - \bar{V}_{(m)}^-(\Delta )+ \bar{V}_{(m)}^-(0) \right\| , \end{aligned}$$
(A13)

where \(V_{(m)}^+(\Delta ) = n^{-1/2}\sum _{i=1}^{n}\psi _\tau \left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left[ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + n^{-1/2}\Delta \right] \right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\) and \(\bar{V}_{(m)}^+(\Delta ) = EV_{(m)}^+(\Delta )\), and \(V_{(m)}^-(\Delta )\) and \(\bar{V}_{(m)}^-(\Delta ) \) are similarly defined. We need to show that each term on the right side of (A13) is \(o_p(1)\). Since the proofs for the two terms are similar, we only show that the first term is \(o_p(1)\).

Let \(\mathcal {H} \triangleq \left\{ \Delta \in R^{K_{(m)}}: \left\| \Delta \right\| \le K_{(m)}^{1/2}L \right\} \) for some positive \(L < \infty \). Let \( |\Delta |_{\infty }\) represent the maximum of the absolute values of the coordinates of \(\Delta \). By selecting \(N_g = (2n^2)^{K_{(m)}}\) grid points, \(\left\{ \Delta _1,\ldots , \Delta _{N_g}\right\} \), \(\mathcal {H}\) can be covered by cubes \(\mathcal {H}_s = \left\{ \Delta \in R^{K_{(m)}}: |\Delta -\Delta _s |_{\infty }\le \gamma _n \right\} \) with sides of length \(\gamma _n = LK_{(m)}^{1/2}/n^2\).
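The covering count can be made concrete with a small Python sketch. It encloses the ball \(\mathcal {H}\) in the cube \([-LK_{(m)}^{1/2}, LK_{(m)}^{1/2}]^{K_{(m)}}\), splits each axis into \(2n^2\) cells of side \(\gamma _n\), and reports \(\log N_g\). The function names and the choice of cell centres as grid points are assumptions of this sketch rather than the paper's construction.

```python
import math


def covering_summary(n, K, L=1.0):
    """Return (gamma_n, cells per axis, log N_g) for the grid on H."""
    radius = L * math.sqrt(K)        # H lies in the cube [-radius, radius]^K
    gamma_n = radius / n ** 2        # side length L * K^{1/2} / n^2
    per_axis = 2 * n ** 2            # cells of side gamma_n across 2 * radius
    log_Ng = K * math.log(per_axis)  # log of N_g = (2 n^2)^K
    return gamma_n, per_axis, log_Ng


def nearest_center(x, radius, per_axis):
    """Centre of the cell containing the coordinate x in [-radius, radius]."""
    side = 2 * radius / per_axis
    k = min(int((x + radius) / side), per_axis - 1)
    return -radius + (k + 0.5) * side
```

Every coordinate lies within \(\gamma _n/2\) of a cell centre, so the cubes \(\mathcal {H}_s\) around the centres cover \(\mathcal {H}\). In addition \(\log N_g = K_{(m)}\log (2n^2) \le 3K_{(m)}\log n\) for \(n \ge 2\), which is the bound on \(N_g\) that enters the Bernstein step.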

Let \(\epsilon _{i(m)}^* = Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\). By the fact that \(\psi _\tau (\cdot )\) is monotone and by Minkowski’s inequality, we can show that

$$\begin{aligned} \begin{aligned}&\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}^+(\Delta ) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta )+ \bar{V}_{(m)}^+(0)\right\| \\&\quad \le {} \max _{1\le s \le N_g} \left\| V_{(m)}^+(\Delta _s) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta _s)+ \bar{V}_{(m)}^+(0)\right\| \\&\qquad + \max _{1\le s \le N_g}\left\| n^{-1/2}\sum _{i=1}^{n}\left\{ E\left[ \psi _{si(m)}(\gamma _n)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right. \right. \\&\qquad \left. \left. -E\left[ \psi _{si(m)}(-\gamma _n)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right\} \right\| \\&\qquad + \max _{1\le s \le N_g}\left\| n^{-1/2}\sum _{i=1}^{n}\left\{ \left[ \left( \psi _{si(m)}(\gamma _n) - \psi _{si(m)}(0)\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right. \right. \\&\qquad \left. \left. -E\left[ \left( \psi _{si(m)}(\gamma _n) - \psi _{si(m)}(0)\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right\} \right\| \\&\quad \triangleq {} \Omega _{(m)1} + \Omega _{(m)2} + \Omega _{(m)3}, \end{aligned} \end{aligned}$$

where \(\psi _{si(m)}(\gamma ) = \psi _\tau \left( \epsilon _{i(m)}^* - n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s +n^{-1/2}\gamma \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| \right) \).

For \(\Omega _{(m)2}\), conditions (C5) and (C9), Taylor expansion and the fact \(b_{i(m)}^+\le |\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) |\le \left\| \varvec{d}_{(m)} \right\| \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \) are used to obtain

$$\begin{aligned} \begin{aligned} \Omega _{(m)2} ={}&\max _{1\le s \le N_g}\left\| n^{1/2}\left\{ E\left[ F(-a_{i(m)}+n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s+n^{-1/2}\gamma _n\right. \right. \right. \\&\left. \left. \left. \times \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| ) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ |\varvec{X}_i, \varvec{Z}_i\right] \right. \right. \\&\left. \left. -E\left[ F(-a_{i(m)}+n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s-n^{-1/2}\gamma _n\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| )\right. \right. \right. \\&\left. \left. \left. \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ |\varvec{X}_i, \varvec{Z}_i\right] \right\} \right\| \\ \le {}&2c_f\gamma _nE\left[ \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \\ \le {}&2c_f\gamma _n\left\| \varvec{d}_{(m)} \right\| E\left[ \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| ^2 L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right] \\ ={}&2c_f\gamma _n\left\| \varvec{d}_{(m)} \right\| E\left\{ tr\left[ \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)}) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right] \right\} \\ \le {}&2c_f\gamma _n\left\| \varvec{d}_{(m)} \right\| K_{(m)}\frac{\bar{c}_{A(m)}}{c_f} = O(\underline{c}_{B(m)}^{-1/2}\bar{c}_{A(m)}K_{(m)}^{3/2}/n^2) = o(1). \end{aligned} \end{aligned}$$

For \(\Omega _{(m)1}\), we can see that

$$\begin{aligned}&V_{(m)}^+(\Delta _s) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta _s)+ \bar{V}_{(m)}^+(0)\\&\quad ={}n^{-1}\sum _{i=1}^{n}\eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0})\\&\qquad +n^{-1}\sum _{i=1}^{n}\eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+> h_{n0})\\&\quad \triangleq {} \Omega _{(m)1,1} + \Omega _{(m)1,2}, \end{aligned}$$

where \(\eta _{is(m)} = n^{1/2}[\eta _{is(m),0} - E(\eta _{is(m),0})]\), \(\eta _{is(m),0} = \left[ \psi _\tau \left( \epsilon _{i(m)}^* -n^{-1/2} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s\right) -\psi _\tau \left( \epsilon _{i(m)}^* \right) \right] L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\), and \(h_{n0} = (nK_{(m)}^4\underline{c}_{B(m)}^{-4})^{1/8}\). We just need to prove \(\max _{1\le s \le N_g} \left\| \Omega _{(m)1,1} \right\| = o_p(1)\) and \(\max _{1\le s \le N_g} \left\| \Omega _{(m)1,2} \right\| = o_p(1)\).

For \(\Omega _{(m)1,1}\), note that

$$\begin{aligned} \begin{aligned}&Var\left( \eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0}) \right) \\&\quad \le {} E\left[ \eta _{is(m)}^2\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0}) \right] \\&\quad \le {} nE\left[ |\psi _\tau \left( \epsilon _{i(m)}^* -n^{-1/2} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s\right) \right. \\&\qquad \left. -\psi _\tau \left( \epsilon _{i(m)}^* \right) |L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^{+2} \right] \\&\quad \le {} C_1n^{1/2}\underline{c}_{B(m)}^{-1}K_{(m)}^2 \end{aligned} \end{aligned}$$

for some positive \(C_1 < \infty \). By Boole’s and Bernstein’s inequalities, we have

$$\begin{aligned} \begin{aligned}&P\left( \max _{1\le s \le N_g} \left\| \Omega _{(m)1,1} \right\| \ge \epsilon \right) \\&\quad \le {} N_g \max _{1\le s \le N_g} P\left( \left\| n^{-1}\sum _{i=1}^{n}\eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0}) \right\| \ge \epsilon \right) \\&\quad \le {}2 N_g\exp \left( -\frac{n\epsilon ^2}{2C_1n^{1/2}\underline{c}_{B(m)}^{-1}K_{(m)}^2+ 4\epsilon n^{1/2}h_{n0}/3} \right) \\&\quad \le {} 2 \exp (3K_{(m)}\log n) \times \exp (-4K_{(m)}\log n) \\&\quad ={} 2 \exp (-K_{(m)}\log n) = o(1), \end{aligned} \end{aligned}$$

as \(N_g = (2n^2)^{K_{(m)}} \le \exp (3K_{(m)}\log n)\) for \(n \ge 2\), \(n/(\underline{c}_{B(m)}^{-1}n^{1/2}K_{(m)}^2) = n^{1/2}\underline{c}_{B(m)}K_{(m)}^{-2} \gg K_{(m)}\log n\) and \(n/(\epsilon n^{1/2}h_{n0}) = n^{3/8}K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2} \gg K_{(m)}\log n\) by condition (C9).

For \(\Omega _{(m)1,2}\), by Boole’s and Markov’s inequalities, and the fact that \(E\left[ |\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) |^8 \right] = O(K_{(m)}^4\underline{c}_{B(m)}^{-4})\) by similar arguments in (A9), the fact that \(E\left[ |b_{i(m)} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2} |^8 \right] = O(1)\), and condition (C1),

$$\begin{aligned} \begin{aligned}&P\left( \max _{1\le s \le N_g} \left\| \Omega _{(m)1,2} \right\|> \epsilon \right) \le P\left( \max _{1\le i \le n} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+> h_{n0} \right) \\&\quad \le {} nP\left( |\bar{b}_{i(m)} |> K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}h_{n0} \right) \\&\quad \le {} \frac{nK_{(m)}^4\underline{c}_{B(m)}^{-4}}{h_{n0}^8}E\left[ |\bar{b}_{i(m)} |^8 \varvec{1}\left( \left|\bar{b}_{i(m)}\right|> K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}h_{n0}\right) \right] = o(1), \end{aligned} \end{aligned}$$

where \( \bar{b}_{i(m)} = b_{i(m)} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}\). Thus \(\Omega _{(m)1} = o_p(1)\) has been shown. By the same technique, we can show that \(\Omega _{(m)3} = o_p(1)\). Therefore (A6) follows.
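The role of the truncation level \(h_{n0}\) in the bound for \(\Omega _{(m)1,2}\) becomes transparent once the powers are written out. The following arithmetic uses only the definition \(h_{n0} = (nK_{(m)}^4\underline{c}_{B(m)}^{-4})^{1/8}\):

$$\begin{aligned} h_{n0} = n^{1/8}K_{(m)}^{1/2}\underline{c}_{B(m)}^{-1/2}, \qquad \frac{nK_{(m)}^4\underline{c}_{B(m)}^{-4}}{h_{n0}^8} = 1, \qquad K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}h_{n0} = n^{1/8}. \end{aligned}$$

Hence the final bound reduces to \(E\left[ |\bar{b}_{i(m)} |^8 \varvec{1}\left( |\bar{b}_{i(m)} |> n^{1/8}\right) \right] \), a tail of the eighth moment beyond a threshold that diverges with \(n\). This tail vanishes if, for instance, the eighth moments of \(\bar{b}_{i(m)}\) are uniformly integrable in \(n\), a sufficient condition stated here only as an illustration.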

Now, we show (A7). By conditions (C1), (C5)–(C9),

$$\begin{aligned}{} & {} \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| \bar{V}_{(m)}(\Delta )- \bar{V}_{(m)}(0) + A_{(m)}\Delta \right\| _{\varvec{d}_{(m)}}\\{} & {} \quad ={}\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| n^{-1/2}\sum _{i=1}^{n}E\left\{ \left[ F\left( -a_{i(m)}+n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta |\varvec{X}_i,\varvec{Z}_i\right) \right. \right. \right. \\{} & {} \qquad \left. \left. \left. - F\left( -a_{i(m)} |\varvec{X}_i,\varvec{Z}_i\right) \right] \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\} - A_{(m)}\Delta \right\| _{\varvec{d}_{(m)}} \\{} & {} \quad ={}\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| n^{-1}\sum _{i=1}^{n}E\left\{ \int _{0}^{1}\left[ f\left( -a_{i(m)}+sn^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta |\varvec{X}_i,\varvec{Z}_i\right) \right. \right. \right. \\{} & {} \qquad \left. \left. \left. - f\left( -a_{i(m)} |\varvec{X}_i,\varvec{Z}_i\right) \right] ds \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\} \right\| _{\varvec{d}_{(m)}} \\{} & {} \quad \le {} C \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L }n^{-3/2}\sum _{i=1}^{n}E\left\| \Delta ^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right. \\{} & {} \qquad \left. 
\times \Delta L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\| _{\varvec{d}_{(m)}} \\{} & {} \quad \le {} C n^{-1/2}\underline{c}_{B(m)}^{-1/2}\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L }\left( E\left\| \Delta ^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) L^{1/2}(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\| ^2\right) ^{1/2} \\{} & {} \qquad \times \left( E\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta L^{1/2}(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\| ^2\right) ^{1/2}\\{} & {} \quad ={}n^{-1/2}\underline{c}_{B(m)}^{-1/2}O(\bar{c}_{A(m)}^{1/2}K_{(m)}^{1/2})O(K_{(m)}^{3/2})= O(n^{-1/2}\underline{c}_{B(m)}^{-1/2}\bar{c}_{A(m)}^{1/2}K_{(m)}^{2}) = o(1). \end{aligned}$$

Finally, we show (A8). By conditions (C1)–(C7) and the proof of Lemma A2 in Ruppert and Carroll (1980) (see also Welsh (1989)), we have

$$\begin{aligned}&\left\| V_{(m)}(\hat{\Delta }_{(m)})\right\| _{\varvec{d}_{(m)}}\nonumber \\&\quad = {}\left\| n^{-1/2}\sum _{i=1}^{n}\psi _\tau \left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right\| _{\varvec{d}_{(m)}}\nonumber \\&\quad \le {}n^{-1/2}\sum _{i=1}^{n}\varvec{1}\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)})=0 \right) \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\nonumber \\&\quad \le {} n^{-1/2}K_{(m)}\max _{1\le i \le n}\left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}). \end{aligned}$$
(A14)

Since, for any \(\epsilon > 0\), by Boole’s and Markov’s inequalities and conditions (C7)–(C9),

$$\begin{aligned} \begin{aligned}&P\left( n^{-1/2}K_{(m)}\max _{1\le i \le n}\left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})> \epsilon \right) \\&\quad \le {} n P\left( \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) > n^{1/2}K_{(m)}^{-1}\epsilon \right) \\&\quad \le {} n^{-3}K_{(m)}^{8} E\left( \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|^8 L^8(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right) \\&\quad = {} n^{-3}K_{(m)}^{8}O(K_{(m)}^4\underline{c}_{B(m)}^{-4}) = O(n^{-3}K_{(m)}^{12}\underline{c}_{B(m)}^{-4}) =o(1), \end{aligned} \end{aligned}$$

combining this with (A14) shows that (A8) holds. This completes the proof of part (ii).

\(\square \)

1.2 P2. Proof of Theorem 2

Proof

We only prove (i) as the proof of (ii) is similar. Let \(\delta _{n} = L\sqrt{n^{-1}\bar{K}\log n}\) for some large positive constant \(L< \infty \). Let

$$\begin{aligned} \bar{L}_{(m)}(\varvec{\beta }_{(m)}) = E\left[ \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }_{(m)}\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right] . \end{aligned}$$

Define

$$\begin{aligned} \mathcal {D}(\delta _{n}, \varvec{z}_{(m)}) \triangleq \inf _{1\le m \le M} \inf _{\left\| \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| >\delta _n }\left[ \bar{L}_{(m)}(\varvec{\beta }_{(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})) \right] , \end{aligned}$$
(A15)

where \(\mathcal {C}(\delta _n, \varvec{z}_{(m)}) {\equiv } \left\{ \varvec{\beta }_{(m)}:\left\| \varvec{\beta }_{(m)}{-}\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| {>}\delta _n, \left\| \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| {=}o(1) \right\} \). By \(a_{i(m)} {=} \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\), Knight’s identity and condition (C6), for any \(\varvec{\beta }_{(m)}\in \mathcal {C}(\delta _n, \varvec{z}_{(m)})\), we have

$$\begin{aligned}&\bar{L}_{(m)}(\varvec{\beta }_{(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}))\\&\quad ={} E\left\{ \left[ \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }_{(m)}\right) -\rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right] \right. \\&\qquad \left. \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)},\varvec{\lambda }_{(m)})\right\} \\&\quad ={}E\left\{ \left[ \rho _{\tau }\left( \epsilon _i + a_{i(m)} - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right) -\rho _{\tau }\left( \epsilon _i + a_{i(m)}\right) \right] \right. \\&\qquad \left. \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right\} \\&\quad ={}E\left\{ \int _{0}^{\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) } \left[ \varvec{1}(\epsilon _i + a_{i(m)}\le s)-\varvec{1}(\epsilon _i + a_{i(m)}\le 0)\right] ds \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right\} \\&\quad ={}E\left\{ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \int _{0}^{\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) } \left[ F( -a_{i(m)}+s|\varvec{X}_i,\varvec{Z}_i) -F( -a_{i(m)}|\varvec{X}_i,\varvec{Z}_i)\right] ds \right\} \\&\quad \simeq {} \frac{1}{2}\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) ^TA_{(m)}\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \ge \frac{\underline{c}_A\delta _n^2}{2}. \end{aligned}$$
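The Knight's identity step used above can be checked numerically. The sketch below verifies, on a few hand-picked values, that \(\rho _\tau (u-v)-\rho _\tau (u) = -v\psi _\tau (u) + \int _0^v[\varvec{1}(u\le s)-\varvec{1}(u\le 0)]ds\). The convention \(\psi _\tau (u) = \tau - \varvec{1}(u<0)\), the helper names, and the midpoint-rule integration are assumptions made for illustration; they are not taken from the paper.

```python
def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def psi(u, tau):
    """Assumed score convention: psi_tau(u) = tau - 1{u < 0}."""
    return tau - (1.0 if u < 0 else 0.0)


def knight_integral(u, v, steps=20000):
    """Midpoint-rule value of int_0^v [1{u <= s} - 1{u <= 0}] ds (signed in v)."""
    base = 1.0 if u <= 0 else 0.0
    total = 0.0
    for k in range(steps):
        s = v * (k + 0.5) / steps  # midpoint of the k-th cell between 0 and v
        total += (1.0 if u <= s else 0.0) - base
    return v * total / steps


def knight_gap(u, v, tau):
    """Absolute difference between the two sides of Knight's identity."""
    lhs = rho(u - v, tau) - rho(u, tau)
    rhs = -v * psi(u, tau) + knight_integral(u, v)
    return abs(lhs - rhs)


# Hypothetical test points covering both signs of u and v.
worst_gap = max(
    knight_gap(u, v, tau)
    for tau in (0.1, 0.5, 0.9)
    for u in (-1.3, -0.2, 0.4, 2.0)
    for v in (-1.7, -0.3, 0.25, 1.1)
)
print("largest discrepancy:", worst_gap)
```

The discrepancy is limited by the quadrature error of the midpoint rule, which is of order \(|v|/\text{steps}\) because the integrand has a single jump at \(s = u\).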

Then, by Boole’s inequality, the definitions of \(\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)})\) and \(\mathcal {C}(\delta _n, \varvec{Z}_{i(m)})\), and the fact that

$$\begin{aligned}{} & {} \bar{L}_{(m)}(\hat{\varvec{\beta }}_{i(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})) \\{} & {} \qquad \simeq \frac{1}{2}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^TA_{(m)}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \end{aligned}$$

for \(i =1,\ldots ,n\), we have

$$\begin{aligned}&P\left( \max _{1\le i \le n}\max _{1\le m \le M} \left\| \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \ge \delta _n\right) \nonumber \\&\quad \le {}nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \left\| \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \ge \delta _n\right) \nonumber \\&\quad \le {}nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \bar{L}_{(m)}(\hat{\varvec{\beta }}_{i(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})) \ge \mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \nonumber \\&\quad \approx {}nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \mathbb {F}_{i(m)} \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \end{aligned}$$
(A16)

where \(\mathbb {F}_{i(m)} = n\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^TA_{(m)}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \). Thus the crucial step is to bound \(P\left( \mathbb {F}_{i(m)} \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \).

Following the proof of Theorem 1 (ii), we can also obtain that \(\sqrt{n}\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) {\mathop {\longrightarrow }\limits ^{d}} N(0, 1)\). Write \(\tilde{\eta }_{i(m)} \triangleq \sqrt{n}\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \), so that \(\sqrt{n}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) = \left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^+\tilde{\eta }_{i(m)}\). Thus for \(\mathbb {F}_{i(m)}\),

$$\begin{aligned} \mathbb {F}_{i(m)} = {}&n\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^TA_{(m)}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \\ = {}&\tilde{\eta }_{i(m)}^T\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^{+T}A_{(m)}\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^+\tilde{\eta }_{i(m)}\\ \le {}&\lambda _{max}(A_{(m)})\tilde{\eta }_{i(m)}^T\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^{+T}\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^+\tilde{\eta }_{i(m)}\\ = {}&\lambda _{max}(A_{(m)})\tilde{\eta }_{i(m)}^T\left( \varvec{D}_{(m)}^TC_{(m)}^{-1}\varvec{D}_{(m)}\right) ^{-1}\tilde{\eta }_{i(m)}\\ \le {}&\lambda _{max}(A_{(m)})\lambda _{max}(C_{(m)})\left\| \tilde{\eta }_{i(m)}\right\| ^2 \le \frac{\bar{c}_{A}\bar{c}_{B}}{\underline{c}_{A}^2}\left\| \tilde{\eta }_{i(m)}\right\| ^2 \end{aligned}$$

where we have used the fact that \(A^{+T}A^+ = (AA^T)^+\) (Bernstein 2005, Proposition 6.1.6(xvii)) together with conditions (C6)–(C7). Let \(c_{AB} = \frac{\bar{c}_{A}\bar{c}_{B}}{\underline{c}_{A}^2}\); then, according to Lemma 2.1 of Shibata (1981), we have

$$\begin{aligned}&nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \mathbb {F}_{i(m)} \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \nonumber \\&\quad \le {} nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \left\| \tilde{\eta }_{i(m)}\right\| ^2 \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)})/c_{AB} \right) \nonumber \\&\quad \le {} \limsup _{n\rightarrow \infty }nM \max _{1\le m \le M}P\left( \chi ^2(1) \ge 1 + \left[ n\underline{c}_A\delta _n^2/c_{AB}-1\right] \right) \nonumber \\&\quad \le {} \limsup _{n\rightarrow \infty } nM \exp \left( -\frac{\left[ n\underline{c}_A\delta _n^2/c_{AB}-1\right] }{2} \times \left[ 1- \frac{\log ( n\underline{c}_A\delta _n^2/c_{AB})}{ n\underline{c}_A\delta _n^2/c_{AB} -1}\right] \right) =0 \end{aligned}$$
(A17)

as \(nM\exp \left( -0.5 n\underline{c}_A\delta _n^2/c_{AB}\right) = nMn^{-0.5L^2\bar{K}\underline{c}_{A}^3/(\bar{c}_{A} \bar{c}_{B})} = o(1) \) for large enough L and \(\left[ \log ( n\underline{c}_A\delta _n^2/c_{AB})\right] / \left[ n\underline{c}_A\delta _n^2/c_{AB} -1\right] = o(1)\) under our conditions. Thus under (A16)–(A17), (i) follows. \(\square \)
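The exponential bound in the last line of (A17) has the shape of a Chernoff-type tail bound for \(\chi ^2(1)\), namely \(P(\chi ^2(1)\ge 1+x)\le \exp \left( -\frac{x}{2}\left[ 1-\frac{\log (1+x)}{x}\right] \right) \) with \(x = n\underline{c}_A\delta _n^2/c_{AB}-1\). The sketch below compares this bound with the exact tail and illustrates how the union-bound factor \(nM\) is absorbed when \(1+x\) grows like a multiple of \(\log n\). The constant `kappa`, which stands in for \(L^2\bar{K}\underline{c}_A/c_{AB}\), and the values of `n` and `M` are hypothetical and chosen only for illustration.

```python
import math


def chi2_1_tail(t):
    """Exact P(chi^2(1) >= t) for t >= 0, using P(|Z| >= sqrt(t)) = erfc(sqrt(t / 2))."""
    return math.erfc(math.sqrt(t / 2.0))


def chernoff_bound(x):
    """exp(-(x / 2) * [1 - log(1 + x) / x]) for x > 0, the shape appearing in (A17)."""
    return math.exp(-0.5 * x * (1.0 - math.log1p(x) / x))


def union_bound(n, M, kappa):
    """n * M * chernoff_bound(x) with 1 + x = kappa * log(n); kappa is a hypothetical constant."""
    return n * M * chernoff_bound(kappa * math.log(n) - 1.0)


# Exact tail versus the bound at a few thresholds.
checks = [(x, chi2_1_tail(1.0 + x), chernoff_bound(x)) for x in (0.5, 1.0, 5.0, 20.0, 100.0)]
for x, exact, bound in checks:
    print(f"x={x:6.1f}  exact={exact:.3e}  bound={bound:.3e}")

# Union bound over n observations and M models for growing n.
decay = [union_bound(n, 10, 6.0) for n in (10**3, 10**4, 10**5, 10**6)]
print(decay)
```

With `kappa` large enough, the polynomial factor \(nM\) is dominated by the factor of order \(n^{-\kappa /2}\) coming from the exponential, which is the role played by "large enough L" in the proof.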

1.3 P3. Proof of Theorem 3

Proof

Following Li (1987) and Lu and Su (2015), it suffices to show that

$$\begin{aligned} \sup _{\varvec{w}\in \mathcal {W}}\left|\frac{CV_n(\varvec{w})-FPE_n(\varvec{w})}{FPE_n(\varvec{w})} \right|= o_p(1). \end{aligned}$$
(A18)

Let \(E_i\) and \(E_{x,z}\) be the expectations with respect to \((\varvec{X}_i, \varvec{Z}_i)\) and \((\varvec{X}, \varvec{Z})\), respectively. By the fact that

$$\begin{aligned} \begin{aligned}&E\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\mu }\left[ F(s|\varvec{X},\varvec{Z}) -F(0|\varvec{X},\varvec{Z}) \right] ds |\mathcal {A}_n \right\} \\&\quad ={} E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{i(m)})-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \end{aligned} \end{aligned}$$

and Knight’s identity, we have

$$\begin{aligned}&CV_n(\varvec{w})-FPE_n(\varvec{w}) \nonumber \\&\quad ={} \left\{ \frac{1}{n}\sum _{i=1}^{n}\left[ \rho _\tau \left( Y_i-\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)} \right) -\rho _\tau (\epsilon _i) \right] \right\} \nonumber \\&\qquad -\left\{ FPE_n(\varvec{w}) - E[\rho _\tau (\epsilon )]\right\} + \frac{1}{n}\sum _{i=1}^{n}\left\{ \rho _\tau (\epsilon _i)-E[\rho _\tau (\epsilon )] \right\} \nonumber \\&\quad ={} \frac{1}{n}\sum _{i=1}^{n}\left[ \mu _i -\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)} \right] \psi (\epsilon _i) \nonumber \\&\qquad + \frac{1}{n}\sum _{i=1}^{n} \int _{0}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right] ds \nonumber \\&\qquad -E\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\mu }\left[ \varvec{1}(\epsilon \le s) -\varvec{1}(\epsilon \le 0) \right] ds |\mathcal {A}_n \right\} \nonumber \\&\qquad +\frac{1}{n}\sum _{i=1}^{n}\left\{ \rho _\tau (\epsilon _i)-E[\rho _\tau (\epsilon )] \right\} \nonumber \\&\quad \triangleq {} CV_{n1}(\varvec{w}) + CV_{n2}(\varvec{w}) + CV_{n3}(\varvec{w}) + CV_{n4}(\varvec{w}) + CV_{n5}(\varvec{w}) \end{aligned}$$
(A19)

where

$$\begin{aligned} CV_{n1}(\varvec{w}) \triangleq&\,\frac{1}{n}\sum _{i=1}^{n}\left[ \mu _i -\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)} \right] \psi (\epsilon _i),\\ CV_{n2}(\varvec{w}) \triangleq {}&\frac{1}{n}\sum _{i=1}^{n} \int _{0}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right. \\&\left. - F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds,\\ CV_{n3}(\varvec{w}) \triangleq {}&\frac{1}{n}\sum _{i=1}^{n}\left[ \int _{0}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ F(s|\varvec{X}_i, \varvec{Z}_i) - F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \right. \\&\left. - E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] ,\\ CV_{n4}(\varvec{w}) \triangleq {}&\frac{1}{n}\sum _{i=1}^{n} E_i\left[ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right. \\&\left. - E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{i(m)})-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] , \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} CV_{n5}(\varvec{w}) \triangleq \frac{1}{n}\sum _{i=1}^{n}\left\{ \rho _\tau (\epsilon _i)-E[\rho _\tau (\epsilon )] \right\} . \end{aligned} \end{aligned}$$

In order to prove (A18), we need to show that (i) \(\min _{\varvec{w}\in \mathcal {W}}FPE_n(\varvec{w}) \ge E[\rho _\tau (\epsilon )] - o_p(1)\); (ii) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1}(\varvec{w})|= o_p(1)\); (iii) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n2}(\varvec{w})|= o_p(1)\); (iv) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n3}(\varvec{w})|= o_p(1)\); (v) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n4}(\varvec{w})|= o_p(1)\); (vi) \(|CV_{n5}(\varvec{w})|= o_p(1)\). (vi) follows by the weak law of large numbers so we just show (i)–(v).

We first prove (i). Let \(u(\varvec{w}) \triangleq \mu -\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\).

By Knight’s identity, Taylor expansion, Jensen’s inequality, conditions (C5), (C6) and (C9), Theorem 2, and the fact that \(E [\psi _\tau (\epsilon +u(\varvec{w}))] = 0\) by the first-order condition for the population minimization problem (10), we have

$$\begin{aligned}&FPE_n(\varvec{w}) - E[\rho _\tau (\epsilon +u(\varvec{w}))] \\&\quad ={} E\left\{ \rho _\tau \left( \epsilon +u(\varvec{w}) - \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right) \right. \\&\qquad \left. - \rho _\tau (\epsilon +u(\varvec{w}))|\mathcal {A}_n \right\} \\&\quad ={} E\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) }\varvec{1}\left( \epsilon +u(\varvec{w}) \le s \right) \right. \\&\qquad \left. - \varvec{1}(\epsilon +u(\varvec{w})\le 0) ds|\mathcal {D}_n \right\} \\&\quad ={} E_{x,z}\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) } F\left( s-u(\varvec{w})|\varvec{X},\varvec{Z} \right) \right. \\&\qquad \left. 
- F\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) ds \right\} \\&\quad ={} E_{x,z}\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) } f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) s ds \right\} + o_p(1)\\&\quad ={} \frac{1}{2}E_{x,z}\left\{ f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) \left[ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right] ^2\right\} + o_p(1)\\&\quad \le {} \frac{1}{2}E_{x,z}\left\{ f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) \sum _{m=1}^{M}w_m \left[ \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right] ^2\right\} + o_p(1)\\&\quad \le {} \frac{1}{2}\left\{ \sum _{m=1}^{M}w_m \left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) ^T\right. \\&\quad E[f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) \varvec{B}_{(m)}(\varvec{X}_{(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})]\\&\qquad \left. \times \left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right\} + o_p(1)\\&\quad \le {} \frac{\bar{c}_A}{2}\max _{1\le m \le M}\left\| \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right\| ^2 + o_p(1)\\&\quad ={}o_p(1). \end{aligned}$$

Define \(H(t) \triangleq E[\rho _\tau (\epsilon + t)- \rho _\tau (\epsilon )]\) where \(t \in R\). It can be seen that H(t) has a global minimum at \(t = 0\). We obtain that \(\min _{\varvec{w}\in \mathcal {W}} E[\rho _\tau (\epsilon +u(\varvec{w}))] \ge E[\rho _\tau (\epsilon )]\). As a result, we have

$$\begin{aligned} \begin{aligned} \min _{\varvec{w}\in \mathcal {W}}FPE_n(\varvec{w}) ={}&\min _{\varvec{w}\in \mathcal {W}} E[\rho _\tau (\epsilon +u(\varvec{w}))] - o_p(1)\\ \ge {}&E[\rho _\tau (\epsilon )] - o_p(1). \end{aligned} \end{aligned}$$
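The claim that \(H(t)\) is minimized at \(t = 0\) can be illustrated numerically. The sketch below assumes, for illustration only, that \(\epsilon \) is a standard normal variable shifted so that its \(\tau \)-th quantile is zero, and replaces the expectation by an average over a deterministic grid of quantiles. The error distribution, the grid sizes, and the function names are assumptions and do not come from the paper.

```python
from statistics import NormalDist


def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def make_errors(tau, size=2001):
    """Deterministic stand-in for epsilon: normal quantiles shifted so the tau-quantile is 0."""
    nd = NormalDist()
    shift = nd.inv_cdf(tau)
    return [nd.inv_cdf((k + 0.5) / size) - shift for k in range(size)]


def H(t, eps, tau):
    """Average of rho_tau(eps + t) - rho_tau(eps), approximating E[rho(eps + t) - rho(eps)]."""
    return sum(rho(e + t, tau) - rho(e, tau) for e in eps) / len(eps)


tau = 0.25
eps = make_errors(tau)
grid = [k / 100.0 for k in range(-100, 101)]  # t in [-1, 1] with step 0.01
values = [H(t, eps, tau) for t in grid]
t_star = grid[values.index(min(values))]
print("grid minimizer of H:", t_star)
```

Because \(H\) is convex, the grid minimizer sits next to the true minimizer, and \(H(0) = 0\) by construction, so any shift \(u(\varvec{w}) \ne 0\) can only increase the expected check loss.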

(ii) For \(CV_{n1}(\varvec{w})\), we decompose it as follows

$$\begin{aligned} \begin{aligned} CV_{n1}(\varvec{w}) ={}&\frac{1}{n}\sum _{i=1}^{n}\left[ \mu _i -\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right] \psi (\epsilon _i)\\&- \frac{1}{n}\sum _{i=1}^{n}\left[ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right] \psi (\epsilon _i)\\ \triangleq {}&CV_{n1,1}(\varvec{w}) - CV_{n1,2}(\varvec{w}) \end{aligned} \end{aligned}$$

It can be seen that \(E[CV_{n1,1}(\varvec{w})] = 0\) and \(Var[CV_{n1,1}(\varvec{w})] = O(\bar{K}/n)\), so \(CV_{n1,1}(\varvec{w}) = o_p(1)\) for each \(\varvec{w} \in \mathcal {W}\). If both M and \(\bar{K} = \max _{1\le m \le M}K_m\) are finite, then by condition (C7) and the Glivenko–Cantelli theorem (Theorem 2.4.1 in Van der Vaart and Wellner (1996)), we have \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1,1}(\varvec{w})|= o_p(1)\). To be specific, consider the class of functions \(\mathcal {F} = \left\{ f(\cdot ,\cdot ,\cdot ;\varvec{w}): \varvec{w}\in \mathcal {W}\right\} \), where \( f(\cdot ,\cdot ,\cdot ;\varvec{w}): R\times \mathcal {S}_{X}\times \mathcal {S}_{Z} \rightarrow R\) is defined by

$$\begin{aligned} f(\epsilon _i,\varvec{X}_{i},\varvec{Z}_{i};\varvec{w}) = \left[ \mu _i -\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)}) \right] \psi (\epsilon _i). \end{aligned}$$

We define the metric \(|\cdot |_1\) on \(\mathcal {W}\) by \(|\varvec{w}_1-\varvec{w}_2|_1 = \sum _{m=1}^{M}|w_{m1}-w_{m2}|\) for any \(\varvec{w}_1 = (w_{11},w_{21},\ldots ,w_{M1})\) and \(\varvec{w}_2 = (w_{12},w_{22},\ldots ,w_{M2})\) in \(\mathcal {W}\). The \(\epsilon \)-covering number of \(\mathcal {W}\) with respect to \(|\cdot |_1\) is \(N(\epsilon ) = O(1/\epsilon ^{M-1})\). According to the fact that

$$\begin{aligned} \begin{aligned}&|f(\epsilon _i,\varvec{X}_{i},\varvec{Z}_{i};\varvec{w}_1) - f(\epsilon _i,\varvec{X}_{i},\varvec{Z}_{i};\varvec{w}_2) |\\&\quad ={}\left|\sum _{m=1}^{M}(w_{m1} - w_{m2}) \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\psi (\epsilon _i) \right|\\&\quad \le {} c_{\beta }\left|\varvec{w}_1 - \varvec{w}_2\right|_1 \max _{1\le m \le M}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| , \end{aligned} \end{aligned}$$

where \(c_{\beta } = \max _{1\le m \le M}\left\| \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| = O(\bar{K}^{1/2})\) and that \(\max _{1\le m \le M}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| <\infty \) in the case of finite M and \(\bar{K}\), together with Theorem 2.7.11 in Van der Vaart and Wellner (1996), the \(\epsilon \)-bracketing number of \(\mathcal {F}\) with respect to the \(L_1(P)\)-norm satisfies \(N(\epsilon , L_1(P))\le C/\epsilon ^{M-1}\) for some finite C. Thus \(\mathcal {F}\) is Glivenko–Cantelli by Theorem 2.4.1 in Van der Vaart and Wellner (1996).

When either \(M \rightarrow \infty \) or \(\bar{K} \rightarrow \infty \) as \(n \rightarrow \infty \), the above argument no longer applies. To allow for diverging M and \(\bar{K}\), let \(r_n = (\bar{K}\log n)^{-1}\). We build grids using regions of the form \(\textbf{W}_j = \left\{ \varvec{w}:|\varvec{w} - \varvec{w}_j|_1 \le r_n\right\} \). By choosing the centers \(\varvec{w}_j = (w_{1j},\ldots ,w_{Mj})^T\) suitably, \(\mathcal {W}\) can be covered with \(N_{w} = O(1/r_n^{M-1})\) regions \(\textbf{W}_j\), \(j = 1,\ldots , N_{w}\). It can be seen that

$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \textbf{W}_j}\left|CV_{n1,1}(\varvec{w}) - CV_{n1,1}(\varvec{w}_j)\right|\\&\quad ={}\sup _{\varvec{w}\in \textbf{W}_j}\left|\sum _{m=1}^{M}(w_{m} - w_{mj})\frac{1}{n}\sum _{i=1}^{n} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\psi (\epsilon _i) \right|\\&\quad \le {}O_p(\bar{K}) \sup _{\varvec{w}\in \textbf{W}_j}\sum _{m=1}^{M}|w_{m} - w_{mj}|\\&\quad \le {}O_p(\bar{K})r_n = o_p(1), \end{aligned} \end{aligned}$$

where the result holds uniformly in \(\textbf{W}_j\). Therefore, we have

$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1,1}(\varvec{w})|\\&\quad ={} \max _{1\le j \le N_{w}} \sup _{\varvec{w}\in \textbf{W}_j}|CV_{n1,1}(\varvec{w})|\\&\quad \le {} \max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|+\max _{1\le j \le N_{w}} \sup _{\varvec{w}\in \textbf{W}_j}|CV_{n1,1}(\varvec{w})-CV_{n1,1}(\varvec{w}_j)|\\&\quad ={} \max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|+ o_p(1). \end{aligned} \end{aligned}$$

We are going to use the fact that

$$\begin{aligned} \begin{aligned}&\max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \\&\quad \le {}\max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}E\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \\&\qquad + \max _{1\le m \le M}\left|\frac{1}{n}\sum _{i=1}^{n}\left( \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| - E\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \right) \right|\\&\quad ={}O(\bar{K}^{1/2}) + o_p(1) = O_p(\bar{K}^{1/2}). \end{aligned} \end{aligned}$$

Let \(h_n = (Mn\bar{K}^2)^{1/2}\). For \(CV_{n1,1}(\varvec{w}_j)\), we have \(\max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|\le \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)} \right|\), where \(u_{i(m)} = \left( \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \psi _\tau (\epsilon _i)\) and \(E(u_{i(m)}) = 0\). Thus for any \(\epsilon > 0\), we have

$$\begin{aligned} \begin{aligned}&P\left( \max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|\ge 2\epsilon \right) \\&\quad \le {} P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)} \right|\ge 2\epsilon \right) \\&\quad \le {} P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n) \right|\ge \epsilon \right) \\&\qquad + P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|> h_n) \right|\ge \epsilon \right) \\&\quad \triangleq {} R_{n1} + R_{n2}. \end{aligned} \end{aligned}$$

Since, by the fact that \(A^TBA \le \lambda _{max}(B)A^TA\) for any real symmetric matrix B and conformable A,

$$\begin{aligned} \begin{aligned}&Var(u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n))\le E(u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n))^2\\&\quad \le {} 2E(\mu _i^2\psi _\tau ^2(\epsilon _i)) + 2\varvec{\beta }^{*T}_{(m)}(\varvec{Z}_{i(m)})E[\varvec{B}_{(m)}(\varvec{X}_{i(m)}) \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})]\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)}) \le c_{\bar{K}}\bar{K} \end{aligned} \end{aligned}$$

for some positive constant \(c_{\bar{K}} < \infty \), by Bernstein’s and Boole’s inequalities, and \(n^{-1}\bar{K}^2M(\log M)^2 = o_p(1)\), we have

$$\begin{aligned} R_{n1}\le & {} M\max _{1\le m \le M}P\left( \left|\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n) \right|\ge n\epsilon \right) \\\le & {} 2M\exp \left( -\frac{n\epsilon ^2}{2c_{\bar{K}}\bar{K}+2\epsilon h_n/3} \right) \\\le & {} 2\exp \left( -\frac{n\epsilon ^2}{2c_{\bar{K}}\bar{K}+2\epsilon h_n/3} + \log M \right) = o(1). \end{aligned}$$
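The bound on \(R_{n1}\) uses Bernstein's inequality in the form \(P\left( \left|\sum _{i=1}^{n}u_i\right|\ge n\epsilon \right) \le 2\exp \left( -\frac{n\epsilon ^2}{2\sigma ^2+2\epsilon h/3}\right) \) for independent mean-zero variables with \(|u_i|\le h\) and variance at most \(\sigma ^2\). As a minimal sketch, the code below compares this bound with the exact tail probability for Rademacher signs (\(h = \sigma ^2 = 1\)), a toy case chosen only because its distribution is available in closed form; it is not the truncated variable \(u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n)\) of the proof, and the function names are hypothetical.

```python
import math


def exact_two_sided_tail(n, eps):
    """Exact P(|sum of n Rademacher signs| >= n * eps), computed from the binomial law."""
    total = 0
    for k in range(n + 1):
        # With k plus-signs the sum equals 2k - n.
        if abs(2 * k - n) >= n * eps:
            total += math.comb(n, k)
    return total / 2 ** n


def bernstein_bound(n, eps, var, bound):
    """2 * exp(-n eps^2 / (2 var + 2 eps bound / 3)), the form used for R_{n1}."""
    return 2.0 * math.exp(-n * eps ** 2 / (2.0 * var + 2.0 * eps * bound / 3.0))


n = 200
comparison = [
    (eps, exact_two_sided_tail(n, eps), bernstein_bound(n, eps, 1.0, 1.0))
    for eps in (0.1, 0.2, 0.3)
]
for eps, exact, bound in comparison:
    print(f"eps={eps:.1f}  exact={exact:.3e}  bernstein={bound:.3e}")
```

In the proof the same bound is multiplied by M through Boole's inequality, which is why the exponent must also dominate \(\log M\).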

Similarly, by Markov’s and Boole’s inequalities and condition (C1), we have

$$\begin{aligned} R_{n2} \le {}&P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|> h_n) \right|\ge \epsilon \right) \\ \le {}&P\left( \max _{1\le m \le M} \frac{1}{n}\sum _{i=1}^{n} \left|u_{i(m)} \right|\varvec{1}(|u_{i(m)}|> h_n) \ge \epsilon \right) \\ \le {}&P\left( \max _{1\le m \le M} \max _{1\le i \le n} |u_{i(m)}|> h_n \right) \\ \le {}&\sum _{i=1}^{n}\sum _{m=1}^{M}P\left( \left|\left( \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \psi _\tau (\epsilon _i)\right|> h_n \right) \\ \le {}&\frac{1}{h_n^4}\sum _{i=1}^{n}\sum _{m=1}^{M}E\left[ \left|\mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right|^4 \psi _\tau ^4(\epsilon _i)\right. \\&\left. \times \varvec{1}\left( \left|\mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right|^4 \psi _\tau ^4(\epsilon _i) > h_n^4 \right) \right] \\ ={}&o(1). \end{aligned}$$

Thus we have \(\max _{1\le j \le N_{w}}|CV_{n1,1}(\varvec{w}_j)|= o_p(1)\) and \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1,1}(\varvec{w})|= o_p(1)\).

By the triangle inequality and condition (C9), we have

$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}} |CV_{n1,2}(\varvec{w})|\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\sum _{m=1}^{M}w_m\frac{1}{n}\sum _{i=1}^{n}\left|\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \psi (\epsilon _i)\right|\\&\quad \le {} \max _{1\le i \le n}\max _{1\le m \le M}\left\| \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \max _{1\le m \le M} \frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| \\&\quad ={}O_p\left( \sqrt{n^{-1}\bar{K}\log n}\right) O_p(\bar{K}^{1/2}) = o_p(1). \end{aligned} \end{aligned}$$

Thus \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n1,2}(\varvec{w})|= o_p(1)\).

(iii) For \(CV_{n2}(\varvec{w})\), we can see that

$$\begin{aligned}{} & {} CV_{n2}(\varvec{w})\\{} & {} \quad = {} \frac{1}{n}\sum _{i=1}^{n} \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right. \\{} & {} \qquad \left. - F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \\{} & {} \qquad + \frac{1}{n}\sum _{i=1}^{n} \int _{ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right. \\{} & {} \qquad \left. - F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \\{} & {} \quad ={} CV_{n2,1}(\varvec{w})+ CV_{n2,2}(\varvec{w}) \end{aligned}$$

Observing that \(E[CV_{n2,1}(\varvec{w})] = 0\) and \(Var[CV_{n2,1}(\varvec{w})] = O(\bar{K}/n)\), we have \(CV_{n2,1}(\varvec{w}) = o_p(1)\) for each \(\varvec{w} \in \mathcal {W}\). Similarly to the proof for \(CV_{n1,1}(\varvec{w})\), \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n2,1}(\varvec{w})|= o_p(1)\).

Considering the fact that \(\left|\varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0)- F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right|\le 2\), we have

$$\begin{aligned}{} & {} |CV_{n2,2}(\varvec{w})|\nonumber \\{} & {} \quad \le {}\frac{2}{n}\sum _{i=1}^{n} \left|\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\nonumber \\{} & {} \quad \le {}2\max _{1\le i \le n}\max _{1\le m \le M}\left\| \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| \nonumber \\{} & {} \quad ={}O_p\left( \sqrt{n^{-1}\bar{K}\log n}\right) O_p(\bar{K}^{1/2}) = o_p(1). \end{aligned}$$

(iv) For \(CV_{n3}(\varvec{w})\), we have

$$\begin{aligned} CV_{n3}(\varvec{w})= & {} \frac{1}{n}\sum _{i=1}^{n}\left[ \int _{0}^{ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i} \left[ F(s|\varvec{X}_i, \varvec{Z}_i) - F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \right. \\{} & {} \left. - E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] \\{} & {} + \frac{1}{n}\sum _{i=1}^{n}\left[ \int _{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ F(s|\varvec{X}_i, \varvec{Z}_i) - F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \right. \\{} & {} \left. - E_i\left\{ \int _{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] \\{} & {} \triangleq {} CV_{n3,1}(\varvec{w}) + CV_{n3,2}(\varvec{w}) \end{aligned}$$

The proof of \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n3,1}(\varvec{w})|= o_p(1)\) is similar to that of \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n1,1}(\varvec{w})|= o_p(1)\). Since \(|F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i)|\le 1\), we have

$$\begin{aligned}{} & {} |CV_{n3,2}(\varvec{w})|\\{} & {} \quad \le {} \frac{1}{n}\sum _{i=1}^{n} \left|\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\{} & {} \qquad +\frac{1}{n}\sum _{i=1}^{n} E_i \left|\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\{} & {} \quad \triangleq {} CV_{n3,21}(\varvec{w}) + CV_{n3,22}(\varvec{w}). \end{aligned}$$

The term \(CV_{n3,21}(\varvec{w})\) can be handled in the same way as \(CV_{n2,2}(\varvec{w})\). For \(CV_{n3,22}(\varvec{w})\), by the Cauchy–Schwarz and triangle inequalities, Theorem 1, and the fact that \(A^TBA\le \lambda _{max}(B)A^TA\) for any real symmetric matrix B, we can see that

$$\begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}} CV_{n3,22}(\varvec{w})\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\frac{1}{n}\sum _{i=1}^{n} \sum _{m=1}^{M}w_mE_i \left|\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\frac{1}{n}\sum _{i=1}^{n} \sum _{m=1}^{M}w_m \left[ \left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^T E_i\left( \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right) \right. \\&\qquad \left. \times \left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right] ^{1/2} \\&\quad \le {} \max _{1\le m \le M} \left[ \lambda _{max}\left( E_i\left( \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right) \right) \right] ^{1/2} \\&\qquad \times \max _{1\le i \le n}\max _{1\le m \le M} \left\| \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| =o_p(1). \end{aligned}$$

Thus \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n3,2}(\varvec{w})|= o_p(1)\).

(v) For \(CV_{n4}(\varvec{w})\), since \(|F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i)|\le 1\), and referring to the proof of \(CV_{n3,22}(\varvec{w})\), we have

$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}}CV_{n4}(\varvec{w}) \\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\frac{1}{n}\sum _{i=1}^{n} E_i\left|\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\&\quad ={} o_p(1). \end{aligned} \end{aligned}$$

Thus the theorem has been proved. \(\square \)

About this article

Cite this article

Sun, X., Zhang, L. Jackknife model averaging for mixed-data kernel-weighted spline quantile regressions. Metrika (2023). https://doi.org/10.1007/s00184-023-00932-2
