Appendix A Proofs of the main theorems
In the following proofs, Knight’s (1998) identity will be used repeatedly:
$$\begin{aligned} \rho _\tau (u+v) - \rho _\tau (u) = v\psi _\tau (u) + \int _{0}^{-v}[\varvec{1}(u\le s) - \varvec{1}(u\le 0)]ds, \end{aligned}$$
where \(\psi _\tau (u) = \tau - \varvec{1}(u\le 0)\).
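As a quick sanity check of the identity, the following worked instance assumes the standard check loss \(\rho _\tau (u) = u\{\tau - \varvec{1}(u<0)\}\) (its definition is not restated in this appendix) and uses the illustrative values \(\tau = 1/2\), \(u = 1\), \(v = -2\), which are not taken from the paper:
$$\begin{aligned} \rho _{1/2}(u+v) - \rho _{1/2}(u)&= \rho _{1/2}(-1) - \rho _{1/2}(1) = \tfrac{1}{2} - \tfrac{1}{2} = 0, \\ v\psi _{1/2}(u) + \int _{0}^{-v}[\varvec{1}(u\le s) - \varvec{1}(u\le 0)]ds&= (-2)\cdot \tfrac{1}{2} + \int _{0}^{2}\varvec{1}(1\le s)ds = -1 + 1 = 0. \end{aligned}$$
Both sides agree, as the identity requires. In part (i) of the proof of Theorem 1 the identity is applied with \(u = \epsilon _i + a_{i(m)}\) and \(v = -\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}\), so that the upper limit of the integral becomes \(\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}\).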
1.1 P1. Proof of Theorem 1
Proof
(i) Let \(\delta _n = \sqrt{K_{(m)}/n}\), and let \(\varvec{v}_{(m)} \in R^{K_{(m)}}\) be such that \(\left\| \varvec{v}_{(m)}\right\| = c\), where \(c > 0\) is a sufficiently large constant. Let
$$\begin{aligned} L_{n(m)}(\varvec{\beta }_{(m)}) = \sum _{i=1}^{n} \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }_{(m)}\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}), \end{aligned}$$
and we just need to show that for any given \(\epsilon > 0\) there exists a large constant c such that, for large enough n, we have
$$\begin{aligned} P\left[ \inf _{\left\| \varvec{v}_{(m)}\right\| = c} L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) > L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right] \ge 1- \epsilon , \end{aligned}$$
which implies that, with probability approaching 1, there exists a local minimum \(\hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)})\) in the ball \(\left\{ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}: \left\| \varvec{v}_{(m)}\right\| \le c\right\} \) such that \(\left\| \hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) -\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) \right\| = O_p(\delta _n)\). By the convexity of \(L_{n(m)}\), it is also the global minimum.
Let \(a_{i(m)} = \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\). Then by Knight’s identity, we have
$$\begin{aligned}&L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) - L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \nonumber \\&\quad ={} \sum _{i=1}^{n} \left\{ \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) \right) \right. \nonumber \\&\qquad \left. - \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right\} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\quad ={} \sum _{i=1}^{n} \left\{ \rho _{\tau }\left( \epsilon _i + a_{i(m)} -\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}\right) - \rho _{\tau }\left( \epsilon _i + a_{i(m)}\right) \right\} \nonumber \\ {}&\times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\quad ={} \sum _{i=1}^{n} \left\{ -\delta _n\psi _\tau (\epsilon _i + a_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)} + \int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds \right\} \nonumber \\&\qquad \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\quad ={} -\delta _n\sum _{i=1}^{n} \psi _\tau (\epsilon _i + a_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \nonumber \\&\qquad + \sum _{i=1}^{n} E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds \right] \nonumber \\&\qquad + \sum _{i=1}^{n} \left\{ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds\right. \nonumber \\&\qquad \left. 
-E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds \right] \right\} \nonumber \\&\quad \triangleq {} A_{n{m},1}(\varvec{v}_{(m)}) + A_{n{m},2}(\varvec{v}_{(m)}) + A_{n{m},3}(\varvec{v}_{(m)}), \end{aligned}$$
(A1)
where \(\alpha _{i(m)}(s) = \varvec{1}(\epsilon _i + a_{i(m)}\le s) - \varvec{1}(\epsilon _i + a_{i(m)}\le 0)\).
By the first order condition for the population minimization problem (10), we have
$$\begin{aligned} E\left[ \psi _\tau (\epsilon _i + a_{i(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)},\varvec{\lambda }_{(m)})\right] = 0. \end{aligned}$$
(A2)
By the condition (C7),
$$\begin{aligned} \begin{aligned}&E|A_{n{m},1}(\varvec{v}_{(m)}) |^2\\&\quad \le {} \delta _n^2\sum _{i=1}^{n}\varvec{v}_{(m)}^TE\left[ \psi _\tau ^2(\epsilon _i + a_{i(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)},\varvec{\lambda }_{(m)})\right] \varvec{v}_{(m)} \\&\quad \le {} \bar{c}_{B(m)}n\delta _n^2\left\| \varvec{v}_{(m)} \right\| ^2 \end{aligned} \end{aligned}$$
and thus by Chebyshev’s inequality, we have
$$\begin{aligned} A_{n{m},1}(\varvec{v}_{(m)}) = O_p(\bar{c}_{B(m)}^{1/2}n^{1/2}\delta _n)\left\| \varvec{v}_{(m)} \right\| . \end{aligned}$$
(A3)
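The step from the second-moment bound to (A3) can be spelled out as follows. This is only a sketch, in which \(M > 0\) is an arbitrary constant introduced for illustration. Since \(E\left[ A_{n{m},1}(\varvec{v}_{(m)})\right] = 0\) by (A2), Chebyshev’s inequality gives
$$\begin{aligned} P\left( |A_{n{m},1}(\varvec{v}_{(m)}) |> M\bar{c}_{B(m)}^{1/2}n^{1/2}\delta _n\left\| \varvec{v}_{(m)} \right\| \right) \le \frac{E|A_{n{m},1}(\varvec{v}_{(m)}) |^2}{M^2\bar{c}_{B(m)}n\delta _n^2\left\| \varvec{v}_{(m)} \right\| ^2} \le \frac{1}{M^2}, \end{aligned}$$
which can be made arbitrarily small by taking \(M\) large. This is precisely the \(O_p\) statement in (A3).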
For \(A_{n{m},2}(\varvec{v}_{(m)})\), according to the law of iterated expectation, Taylor expansion and condition (C6), we have
$$\begin{aligned}&A_{n{m},2}(\varvec{v}_{(m)}) \nonumber \\&\quad ={}\sum _{i=1}^{n} E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}F(-a_{i(m)}+s |\varvec{X}_i,\varvec{Z}_i)\right. \nonumber \\&\qquad \left. - F(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i) ds \right] \nonumber \\&\quad ={}\sum _{i=1}^{n} E\left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}} f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)s ds \right] (1+o_p(1)) \nonumber \\&\quad ={}\frac{1}{2}\delta _n^2\varvec{v}_{(m)}^T\left( \sum _{i=1}^{n} E\left[ f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \times \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \right] \right) \varvec{v}_{(m)}(1+o_p(1)) \nonumber \\&\quad ={}\frac{1}{2}\delta _n^2n\varvec{v}_{(m)}^TA_{(m)} \varvec{v}_{(m)}(1+o_p(1)) \ge \frac{\underline{c}_{A(m)}}{2}\delta _n^2n\left\| \varvec{v}_{(m)} \right\| ^2 \end{aligned}$$
(A4)
with probability approaching 1. It can be seen that \(E \left( A_{n{m},3}(\varvec{v}_{(m)})\right) = 0\), and by condition (C7), we have
$$\begin{aligned} \begin{aligned}&Var \left( A_{n{m},3}(\varvec{v}_{(m)})\right) \\&\quad \le {}\sum _{i=1}^{n}E\left\{ L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\left[ \int _{0}^{\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)}}\alpha _{i(m)}(s) ds\right] ^2\right\} \\&\quad \le {}\sum _{i=1}^{n}E \left[ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\delta _n\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{v}_{(m)} \right] ^2 \\&\quad = {} n\delta _{n}^2\varvec{v}_{(m)}^TE \left[ L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \right] \varvec{v}_{(m)} \le \bar{c}_{B(m)}n\delta _{n}^2\left\| \varvec{v}_{(m)}\right\| ^2. \end{aligned} \end{aligned}$$
Thus we obtain
$$\begin{aligned} A_{n{m},3}(\varvec{v}_{(m)}) = O_p(\bar{c}_{B(m)}^{1/2}n^{1/2}\delta _{n})\left\| \varvec{v}_{(m)}\right\| . \end{aligned}$$
(A5)
By (A3)–(A5), and allowing \(\left\| \varvec{v}_{(m)} \right\| \) to be large enough, both \(A_{n{m},1}(\varvec{v}_{(m)})\) and \(A_{n{m},3}(\varvec{v}_{(m)})\) are dominated by \(A_{n{m},2}(\varvec{v}_{(m)})\) with probability approaching 1. Thus, combining with (A1), we have
$$\begin{aligned} L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) - L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) > 0 \end{aligned}$$
with probability approaching 1. This proves (i).
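The domination argument in part (i) can be made explicit with a short calculation. As a sketch, substitute \(\delta _n = \sqrt{K_{(m)}/n}\), so that \(n\delta _n^2 = K_{(m)}\) and \(n^{1/2}\delta _n = K_{(m)}^{1/2}\), and use \(\left\| \varvec{v}_{(m)}\right\| = c\). Combining (A1) with (A3), (A4) and (A5) then gives, with probability approaching 1,
$$\begin{aligned} L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})+\delta _n\varvec{v}_{(m)}\right) - L_{n(m)}\left( \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right)&\ge \frac{\underline{c}_{A(m)}}{2}K_{(m)}c^2 - O_p\left( \bar{c}_{B(m)}^{1/2}K_{(m)}^{1/2}\right) c \\&= K_{(m)}^{1/2}c\left[ \frac{\underline{c}_{A(m)}}{2}K_{(m)}^{1/2}c - O_p\left( \bar{c}_{B(m)}^{1/2}\right) \right] . \end{aligned}$$
The bracket is positive once \(c\) is large relative to \(\bar{c}_{B(m)}^{1/2}/\underline{c}_{A(m)}\). How these constants are allowed to vary with \(n\) is governed by the paper's conditions, which are not restated here.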
(ii) Let \(\hat{\Delta }_{(m)} = \sqrt{n}\left( \hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) - \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \) and \(\Delta _{(m)} = \sqrt{n}\left( \varvec{\beta }_{(m)} - \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \). It can be seen that
$$\begin{aligned} \hat{\Delta }_{(m)}= & {} \arg \min _{\Delta _{(m)}}\sum _{i=1}^{n}\rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left[ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + n^{-1/2}\Delta _{(m)}\right] \right) \\{} & {} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}). \end{aligned}$$
Let \(V_{(m)}(\Delta ) = n^{-1/2}\sum _{i=1}^{n}\psi _\tau \left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left[ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + n^{-1/2}\Delta \right] \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\) and \(\bar{V}_{(m)}(\Delta ) = E\left( V_{(m)}(\Delta ) \right) \). Define the weighted norm \(\left\| \cdot \right\| _{\varvec{d}_{(m)}}\) by
$$\begin{aligned} \left\| A \right\| _{\varvec{d}_{(m)}} = \left\| \varvec{d}_{(m)}^TA \right\| , \end{aligned}$$
where \(\varvec{d}_{(m)}\) is a \(K_{(m)} \times 1\) vector with \(\left\| \varvec{d}_{(m)} \right\| \le \underline{c}_{B(m)}^{-1/2}\).
We need to show for any large positive constant \(L < \infty \),
$$\begin{aligned}{} & {} \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}(\Delta ) - V_{(m)}(0) - \bar{V}_{(m)}(\Delta )+ \bar{V}_{(m)}(0) \right\| _{\varvec{d}_{(m)}} = o_p(1), \end{aligned}$$
(A6)
$$\begin{aligned}{} & {} \qquad \quad \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| \bar{V}_{(m)}(\Delta )- \bar{V}_{(m)}(0) + A_{(m)}\Delta \right\| _{\varvec{d}_{(m)}} = o_p(1), \end{aligned}$$
(A7)
$$\begin{aligned}{} & {} \left\| V_{(m)}(\hat{\Delta }_{(m)})\right\| _{\varvec{d}_{(m)}} = o_p(1). \end{aligned}$$
(A8)
By (A6)–(A7) and the result of part (i), we have \(\left\| V_{(m)}(\hat{\Delta }_{(m)}){-} V_{(m)}(0) {+} A_{(m)}\hat{\Delta }_{(m)} \right\| _{\varvec{d}_{(m)}} = o_p(1)\). Thus by conditions (C6)–(C7), we obtain \(\hat{\Delta }_{(m)}= A_{(m)}^{-1}V_{(m)}(0)-A_{(m)}^{-1}V_{(m)}(\hat{\Delta }_{(m)}) + A_{(m)}^{-1}R_{(m)}\), and
$$\begin{aligned} \begin{aligned}&\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\hat{\Delta }_{(m)}=\sqrt{n}\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\left( \hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) - \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \\&\quad ={}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}V_{(m)}(0)-\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}V_{(m)}(\hat{\Delta }_{(m)}) +\varvec{D}^T_{(m)}C_{(m)}^{-1/2} A_{(m)}^{-1}R_{(m)}\\&\quad \triangleq {} \mathbb {P}_{(m)1} -\mathbb {P}_{(m)2} +\mathbb {P}_{(m)3} \end{aligned} \end{aligned}$$
where \(\left\| R_{(m)}\right\| _{\varvec{d}_{(m)}} = o_p(1)\) for any \(\varvec{d}_{(m)}\) with \(\left\| \varvec{d}_{(m)} \right\| \le \underline{c}_{B(m)}^{-1/2}\), and \(C_{(m)}^{1/2}\) represents the symmetric square root of \(C_{(m)}\) and \(C_{(m)}^{-1/2}\) denotes the inverse of \(C_{(m)}^{1/2}\).
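The representation of \(\hat{\Delta }_{(m)}\) used above follows from a one-line rearrangement. As a sketch, take \(R_{(m)} \equiv V_{(m)}(\hat{\Delta }_{(m)}) - V_{(m)}(0) + A_{(m)}\hat{\Delta }_{(m)}\), which matches the remainder whose weighted norm is stated to be \(o_p(1)\), and assume that \(A_{(m)}\) is invertible, which is how conditions (C6)–(C7) appear to be used:
$$\begin{aligned} A_{(m)}\hat{\Delta }_{(m)} = V_{(m)}(0) - V_{(m)}(\hat{\Delta }_{(m)}) + R_{(m)} \quad \Longrightarrow \quad \hat{\Delta }_{(m)} = A_{(m)}^{-1}V_{(m)}(0) - A_{(m)}^{-1}V_{(m)}(\hat{\Delta }_{(m)}) + A_{(m)}^{-1}R_{(m)}. \end{aligned}$$
Premultiplying by \(\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\) yields the three terms \(\mathbb {P}_{(m)1}\), \(\mathbb {P}_{(m)2}\) and \(\mathbb {P}_{(m)3}\).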
Let \(\zeta _{ni} = n^{-1/2}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}\psi _\tau \left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\), then \(\mathbb {P}_{(m)1} = \sum _{i=1}^{n}\zeta _{ni}\). By (A2), we have \(E(\zeta _{ni}) = 0\) and \(E(\mathbb {P}_{(m)1}) = 0\). Thus
$$\begin{aligned}{} & {} Var(\mathbb {P}_{(m)1}) = \sum _{i=1}^{n} Var(\zeta _{ni})\\{} & {} \quad ={}n^{-1}\sum _{i=1}^{n}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} E\left[ \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})\right. \\{} & {} \qquad \left. \times L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right] A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\\{} & {} \quad ={}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}B_{(m)} A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\\{} & {} \quad ={}\varvec{D}^T_{(m)}\varvec{D}_{(m)} = 1. \end{aligned}$$
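The last two equalities in the variance calculation can be checked directly. The sketch below assumes that \(C_{(m)} = A_{(m)}^{-1}B_{(m)}A_{(m)}^{-1}\) and that \(\left\| \varvec{D}_{(m)}\right\| = 1\). Neither definition is restated in this appendix, so both are assumptions inferred from the displayed equalities:
$$\begin{aligned} \varvec{D}^T_{(m)}C_{(m)}^{-1/2}\left( A_{(m)}^{-1}B_{(m)}A_{(m)}^{-1}\right) C_{(m)}^{-1/2}\varvec{D}_{(m)} = \varvec{D}^T_{(m)}C_{(m)}^{-1/2}C_{(m)}C_{(m)}^{-1/2}\varvec{D}_{(m)} = \varvec{D}^T_{(m)}\varvec{D}_{(m)} = \left\| \varvec{D}_{(m)}\right\| ^2 = 1. \end{aligned}$$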
By the fact that \(tr(AB)\le \lambda _{max}(A)tr(B)\) for a symmetric matrix A and a positive semi-definite matrix B, where \(tr(\cdot )\) denotes the trace of a matrix, we have
$$\begin{aligned}&E\left\| \zeta _{ni} \right\| ^4\nonumber \\&\quad ={}n^{-2}E\left\{ \left[ tr\left( \varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}\psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})\right. \right. \right. \nonumber \\&\qquad \left. \left. \left. \times L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)} \right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ \left[ tr\left( \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right. \right. \right. \nonumber \\&\qquad \left. \left. \left. \times A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} \right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ tr^2\left( \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right) \right. \nonumber \\&\qquad \left. \times \left[ \lambda _{max}\left( A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1}\right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ tr^2\left( \psi _\tau ^2\left( \epsilon _i+ a_{i(m)} \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right) \right. \nonumber \\&\qquad \left. 
\times \left[ \lambda _{max}\left( A_{(m)}^{-1}C_{(m)}^{-1}A_{(m)}^{-1}\right) \right] ^2\left[ \lambda _{max}\left( \varvec{D}_{(m)}\varvec{D}^T_{(m)}\right) \right] ^2 \right\} \nonumber \\&\quad \le {}n^{-2}E\left\{ tr^2\left( \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)})\right) \left[ \lambda _{max}\left( B_{(m)}^{-1}\right) \right] ^2 \right\} \nonumber \\&\quad ={}O(n^{-2}K_{(m)}^2\underline{c}_{B(m)}^{-2}). \end{aligned}$$
(A9)
Then for any \(\epsilon > 0\), we have
$$\begin{aligned} \begin{aligned}&\sum _{i=1}^{n}E\left[ \left\| \zeta _{ni} \right\| ^2\varvec{1}(\left\| \zeta _{ni} \right\| \ge \epsilon ) \right] \\&\qquad \quad \le {} n\epsilon ^{-2}E\left\| \zeta _{ni} \right\| ^4 = O(n^{-1}K_{(m)}^2\underline{c}_{B(m)}^{-2}) = o(1) \end{aligned} \end{aligned}$$
by condition (C9). Therefore, \(\left\{ \zeta _{ni} \right\} \) satisfies the conditions of the Lindeberg–Feller central limit theorem and we obtain
$$\begin{aligned} \mathbb {P}_{(m)1} {\mathop {\longrightarrow }\limits ^{d}} N(0, 1). \end{aligned}$$
(A10)
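The Lindeberg condition was verified above through a fourth-moment bound. The elementary inequality behind that step can be sketched as follows: on the event \(\left\| \zeta _{ni} \right\| \ge \epsilon \) we have \(\left\| \zeta _{ni} \right\| ^2/\epsilon ^2 \ge 1\), so
$$\begin{aligned} E\left[ \left\| \zeta _{ni} \right\| ^2\varvec{1}(\left\| \zeta _{ni} \right\| \ge \epsilon ) \right] \le E\left[ \left\| \zeta _{ni} \right\| ^2\frac{\left\| \zeta _{ni} \right\| ^2}{\epsilon ^2}\varvec{1}(\left\| \zeta _{ni} \right\| \ge \epsilon ) \right] \le \epsilon ^{-2}E\left\| \zeta _{ni} \right\| ^4. \end{aligned}$$
Summing over \(i = 1,\ldots ,n\) and inserting the rate from (A9) gives \(n\epsilon ^{-2}O(n^{-2}K_{(m)}^2\underline{c}_{B(m)}^{-2}) = O(n^{-1}K_{(m)}^2\underline{c}_{B(m)}^{-2})\) for fixed \(\epsilon \).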
Let \(\varvec{d}_{(m)} = A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\), and by the facts that \( |\varvec{r}^T(\varvec{A}^T + \varvec{A})\varvec{r} |\le \lambda _{max}(\varvec{A}^T + \varvec{A})\left\| \varvec{r} \right\| ^2\) and \(\lambda _{max}(\varvec{A}^T\varvec{A}) = \lambda _{max}(\varvec{A}\varvec{A}^T)\) for any vector \(\varvec{r}\) and square matrix \(\varvec{A}\), we have
$$\begin{aligned} \begin{aligned} \left\| \varvec{d}_{(m)}\right\| ={}&\left\| \varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} \right\| \\ ={}&\left[ \varvec{D}^T_{(m)}C_{(m)}^{-1/2}A_{(m)}^{-1} A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}\right] ^{1/2} \\ \le {}&\left[ \lambda _{max}\left( C_{(m)}^{-1/2}A_{(m)}^{-1} A_{(m)}^{-1}C_{(m)}^{-1/2}\right) \left\| \varvec{D}_{(m)}\right\| ^2 \right] ^{1/2} \\ ={}&\lambda _{max}^{1/2} \left( A_{(m)}^{-1}C_{(m)}^{-1}A_{(m)}^{-1}\right) \\ ={}&\lambda _{max}^{1/2} \left( B_{(m)}^{-1}\right) \le \underline{c}_{B(m)}^{-1/2}. \end{aligned} \end{aligned}$$
Thus for \(\mathbb {P}_{(m)2} \) and \(\mathbb {P}_{(m)3}\), by (A8)
$$\begin{aligned} \left\| \mathbb {P}_{(m)2}\right\| = \left\| V_{(m)}(\hat{\Delta }_{(m)}) \right\| _{A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}} = o_p(1), \end{aligned}$$
(A11)
and
$$\begin{aligned} \left\| \mathbb {P}_{(m)3}\right\| = \left\| R_{(m)}\right\| _{A_{(m)}^{-1}C_{(m)}^{-1/2}\varvec{D}_{(m)}} = o_p(1). \end{aligned}$$
(A12)
Next, we show that (A6)–(A8) hold under conditions (C1)–(C9). For (A6), write \(b_{i(m)} \equiv \varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) = b_{i(m)}^{+} - b_{i(m)}^{-}\), where \(b_{i(m)}^{+} = \max (b_{i(m)},0)\) and \(b_{i(m)}^{-} = \max (-b_{i(m)},0)\). Thus by Minkowski’s inequality, we obtain
$$\begin{aligned}&\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}(\Delta ) - V_{(m)}(0) - \bar{V}_{(m)}(\Delta )+ \bar{V}_{(m)}(0) \right\| _{\varvec{d}_{(m)}}\nonumber \\&\quad \le {} \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}^+(\Delta ) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta )+ \bar{V}_{(m)}^+(0) \right\| \nonumber \\&\qquad + \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}^-(\Delta ) - V_{(m)}^-(0) - \bar{V}_{(m)}^-(\Delta )+ \bar{V}_{(m)}^-(0) \right\| , \end{aligned}$$
(A13)
where \(V_{(m)}^+(\Delta ) = n^{-1/2}\sum _{i=1}^{n}\psi _\tau \left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left[ \varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}) + n^{-1/2}\Delta \right] \right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\) and \(\bar{V}_{(m)}^+(\Delta ) = EV_{(m)}^+(\Delta )\), and \(V_{(m)}^-(\Delta )\) and \(\bar{V}_{(m)}^-(\Delta ) \) are similarly defined. We need to show that each term on the right side of (A13) is \(o_p(1)\). Since the proofs for the two terms are similar, we only show that the first term is \(o_p(1)\).
Let \(\mathcal {H} \triangleq \left\{ \Delta \in R^{K_{(m)}}: \left\| \Delta \right\| \le K_{(m)}^{1/2}L \right\} \) for some positive \(L < \infty \). Let \( |\Delta |_{\infty }\) represent the maximum of the absolute values of the coordinates of \(\Delta \). By selecting \(N_g = (2n^2)^{K_{(m)}}\) grid points, \(\left\{ \Delta _1,\ldots , \Delta _{N_g}\right\} \), \(\mathcal {H}\) can be covered by cubes \(\mathcal {H}_s = \left\{ \Delta \in R^{K_{(m)}}: |\Delta -\Delta _s |_{\infty }\le \gamma _n \right\} \) with sides of length \(\gamma _n = LK_{(m)}^{1/2}/n^2\).
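The size of the grid enters the later tail bounds only through \(\log N_g\). A short calculation, using nothing beyond \(N_g = (2n^2)^{K_{(m)}}\), shows the rate that must be beaten:
$$\begin{aligned} \log N_g = K_{(m)}\log (2n^2) = K_{(m)}\left( \log 2 + 2\log n\right) \le 3K_{(m)}\log n \quad \text {for } n \ge 2, \end{aligned}$$
since \(\log 2 \le \log n\) when \(n \ge 2\). Hence \(N_g \le \exp (3K_{(m)}\log n)\), and any exponential tail bound of order \(\exp (-4K_{(m)}\log n)\) is enough to offset the union over all grid points.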
Let \(\epsilon _{i(m)}^* = Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\). By the fact that \(\psi _\tau (\cdot )\) is monotone and by Minkowski’s inequality, we can show that
$$\begin{aligned} \begin{aligned}&\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| V_{(m)}^+(\Delta ) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta )+ \bar{V}_{(m)}^+(0)\right\| \\&\quad \le {} \max _{1\le s \le N_g} \left\| V_{(m)}^+(\Delta _s) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta _s)+ \bar{V}_{(m)}^+(0)\right\| \\&\qquad + \max _{1\le s \le N_g}\left\| n^{-1/2}\sum _{i=1}^{n}\left\{ E\left[ \psi _{si(m)}(\gamma _n)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right. \right. \\&\qquad \left. \left. -E\left[ \psi _{si(m)}(-\gamma _n)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right\} \right\| \\&\qquad + \max _{1\le s \le N_g}\left\| n^{-1/2}\sum _{i=1}^{n}\left\{ \left[ \left( \psi _{si(m)}(\gamma _n) - \psi _{si(m)}(0)\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right. \right. \\&\qquad \left. \left. -E\left[ \left( \psi _{si(m)}(\gamma _n) - \psi _{si(m)}(0)\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \right\} \right\| \\&\quad \triangleq {} \Omega _{(m)1} + \Omega _{(m)2} + \Omega _{(m)3}, \end{aligned} \end{aligned}$$
where \(\psi _{si(m)}(\gamma ) = \psi _\tau \left( \epsilon _{i(m)}^* - n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s +n^{-1/2}\gamma \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| \right) \).
For \(\Omega _{(m)2}\), conditions (C5) and (C9), Taylor expansion and the fact \(b_{i(m)}^+\le |\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) |\le \left\| \varvec{d}_{(m)} \right\| \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \) are used to obtain
$$\begin{aligned} \begin{aligned} \Omega _{(m)2} ={}&\max _{1\le s \le N_g}\left\| n^{1/2}\left\{ E\left[ F(-a_{i(m)}+n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s+n^{-1/2}\gamma _n\right. \right. \right. \\&\left. \left. \left. \times \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| ) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ |\varvec{X}_i, \varvec{Z}_i\right] \right. \right. \\&\left. \left. -E\left[ F(-a_{i(m)}+n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s-n^{-1/2}\gamma _n\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| )\right. \right. \right. \\&\left. \left. \left. \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ |\varvec{X}_i, \varvec{Z}_i\right] \right\} \right\| \\ \le {}&2c_f\gamma _nE\left[ \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+ \right] \\ \le {}&2c_f\gamma _n\left\| \varvec{d}_{(m)} \right\| E\left[ \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| ^2 L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right] \\ ={}&2c_f\gamma _n\left\| \varvec{d}_{(m)} \right\| E\left\{ tr\left[ \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^T(\varvec{X}_{i(m)}) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right] \right\} \\ \le {}&2c_f\gamma _n\left\| \varvec{d}_{(m)} \right\| K_{(m)}\frac{\bar{c}_{A(m)}}{c_f} = O(\underline{c}_{B(m)}^{-1/2}\bar{c}_{A(m)}K_{(m)}^{3/2}/n^2) = o(1). \end{aligned} \end{aligned}$$
For \(\Omega _{(m)1}\), we can see that
$$\begin{aligned}&V_{(m)}^+(\Delta _s) - V_{(m)}^+(0) - \bar{V}_{(m)}^+(\Delta _s)+ \bar{V}_{(m)}^+(0)\\&\quad ={}n^{-1}\sum _{i=1}^{n}\eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0})\\&\qquad +n^{-1}\sum _{i=1}^{n}\eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+> h_{n0})\\&\quad \triangleq {} \Omega _{(m)1,1} + \Omega _{(m)1,2}, \end{aligned}$$
where \(\eta _{is(m)} = n^{1/2}[\eta _{is(m),0} - E(\eta _{is(m),0})]\),
\(\eta _{is(m),0} {=} \left[ \psi _\tau \left( \epsilon _{i(m)}^* -n^{-1/2} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s\right) {-}\psi _\tau \left( \epsilon _{i(m)}^* \right) \right] L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\), and \(h_{n0} = (nK_{(m)}^4\underline{c}_{B(m)}^{-4})^{1/8}\). We just need to prove \(\max _{1\le s \le N_g} \left\| \Omega _{(m)1,1} \right\| = o_p(1)\) and \(\max _{1\le s \le N_g} \left\| \Omega _{(m)1,2} \right\| = o_p(1)\).
For \(\Omega _{(m)1,1}\), note that
$$\begin{aligned} \begin{aligned}&Var\left( \eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0}) \right) \\&\quad \le {} E\left[ \eta _{is(m)}^2\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0}) \right] \\&\quad \le {} nE\left[ |\psi _\tau \left( \epsilon _{i(m)}^* -n^{-1/2} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta _s\right) \right. \\&\qquad \left. -\psi _\tau \left( \epsilon _{i(m)}^* \right) |L^2(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^{+2} \right] \\&\quad \le {} C_1n^{1/2}\underline{c}_{B(m)}^{-1}K_{(m)}^2 \end{aligned} \end{aligned}$$
for some positive \(C_1 < \infty \). By Boole’s and Bernstein’s inequalities, we have
$$\begin{aligned} \begin{aligned}&P\left( \max _{1\le s \le N_g} \left\| \Omega _{(m)1,1} \right\| \ge \epsilon \right) \\&\quad \le {} N_g \max _{1\le s \le N_g} P\left( \left\| n^{-1}\sum _{i=1}^{n}\eta _{is(m)}\varvec{1}(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+\le h_{n0}) \right\| \ge \epsilon \right) \\&\quad \le {}2 N_g\exp \left( -\frac{n\epsilon ^2}{2C_1n^{1/2}\underline{c}_{B(m)}^{-1}K_{(m)}^2+ 4\epsilon n^{1/2}h_{n0}/3} \right) \\&\quad ={}2 \exp \left( K_{(m)}\log (2n^2)\right) \exp \left( -\frac{n\epsilon ^2}{2C_1n^{1/2}\underline{c}_{B(m)}^{-1}K_{(m)}^2+ 4\epsilon n^{1/2}h_{n0}/3} \right) \\&\quad \le {} 2 \exp (3K_{(m)}\log n) \times \exp (-4K_{(m)}\log n) \\&\quad ={} 2 \exp (-K_{(m)}\log n) = o(1), \end{aligned} \end{aligned}$$
as \(n/(\underline{c}_{B(m)}^{-1}n^{1/2}K_{(m)}^2) = n^{1/2}\underline{c}_{B(m)}K_{(m)}^{-2} \gg K_{(m)}\log n\) and \(n/(\epsilon n^{1/2}h_{n0}) = n^{3/8}K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2} \gg K_{(m)}\log n\) by condition (C9).
For \(\Omega _{(m)1,2}\), by Boole’s and Markov’s inequalities, the fact that \(E\left[ |\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) |^8 \right] = O(K_{(m)}^4\underline{c}_{B(m)}^{-4})\) by arguments similar to those used for (A9), the resulting bound \(E\left[ |b_{i(m)} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2} |^8 \right] = O(1)\), and condition (C1), we have
$$\begin{aligned} \begin{aligned}&P\left( \max _{1\le s \le N_g} \left\| \Omega _{(m)1,2} \right\|> \epsilon \right) \le P\left( \max _{1\le i \le n} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})b_{i(m)}^+> h_{n0} \right) \\&\quad \le {} nP\left( |\bar{b}_{i(m)} |> K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}h_{n0} \right) \\&\quad \le {} \frac{nK_{(m)}^4\underline{c}_{B(m)}^{-4}}{h_{n0}^8}E\left[ |\bar{b}_{i(m)} |^8 \varvec{1}\left( \left|\bar{b}_{i(m)}\right|> K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}h_{n0}\right) \right] = o(1), \end{aligned} \end{aligned}$$
where \( \bar{b}_{i(m)} = b_{i(m)} L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) K_{(m)}^{-1/2}\underline{c}_{B(m)}^{1/2}\). Thus \(\Omega _{(m)1} = o_p(1)\) has been shown. By the same technique, we can show that \(\Omega _{(m)3} = o_p(1)\). Therefore (A6) follows.
Now, we show (A7). By conditions (C1), (C5)–(C9),
$$\begin{aligned}{} & {} \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| \bar{V}_{(m)}(\Delta )- \bar{V}_{(m)}(0) + A_{(m)}\Delta \right\| _{\varvec{d}_{(m)}}\\{} & {} \quad ={}\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| n^{-1/2}\sum _{i=1}^{n}E\left\{ \left[ F\left( -a_{i(m)}+n^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta |\varvec{X}_i,\varvec{Z}_i\right) \right. \right. \right. \\{} & {} \qquad \left. \left. \left. - F\left( -a_{i(m)} |\varvec{X}_i,\varvec{Z}_i\right) \right] \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\} - A_{(m)}\Delta \right\| _{\varvec{d}_{(m)}} \\{} & {} \quad ={}\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L } \left\| n^{-1}\sum _{i=1}^{n}E\left\{ \int _{0}^{1}\left[ f\left( -a_{i(m)}+sn^{-1/2}\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta |\varvec{X}_i,\varvec{Z}_i\right) \right. \right. \right. \\{} & {} \qquad \left. \left. \left. - f\left( -a_{i(m)} |\varvec{X}_i,\varvec{Z}_i\right) \right] ds \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\} \right\| _{\varvec{d}_{(m)}} \\{} & {} \quad \le {} C \sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L }n^{-3/2}\sum _{i=1}^{n}E\left\| \Delta ^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right. \\{} & {} \qquad \left. 
\times \Delta L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\| _{\varvec{d}_{(m)}} \\{} & {} \quad \le {} C n^{-1/2}\underline{c}_{B(m)}^{-1/2}\sup _{\left\| \Delta \right\| \le K_{(m)}^{1/2}L }\left( E\left\| \Delta ^T\varvec{B}_{(m)}(\varvec{X}_{i(m)}) L^{1/2}(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\| ^2\right) ^{1/2} \\{} & {} \qquad \times \left( E\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\Delta L^{1/2}(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right\| ^2\right) ^{1/2}\\{} & {} \quad ={}n^{-1/2}\underline{c}_{B(m)}^{-1/2}O(\bar{c}_{A(m)}^{1/2}K_{(m)}^{1/2})O(K_{(m)}^{3/2})= O(n^{-1/2}\underline{c}_{B(m)}^{-1/2}\bar{c}_{A(m)}^{1/2}K_{(m)}^{2}) = o(1). \end{aligned}$$
Finally, we show (A8). By conditions (C1)–(C7) and the proof of Lemma A2 in Ruppert and Carroll (1980) (see Welsh (1989)), we have
$$\begin{aligned}&\left\| V_{(m)}(\hat{\Delta }_{(m)})\right\| _{\varvec{d}_{(m)}}\nonumber \\&\quad = {}\left\| n^{-1/2}\sum _{i=1}^{n}\psi _\tau \left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)}) \right) \varvec{B}_{(m)}(\varvec{X}_{i(m)})L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right\| _{\varvec{d}_{(m)}}\nonumber \\&\quad \le {}n^{-1/2}\sum _{i=1}^{n}\varvec{1}\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{z}_{(m)})=0 \right) \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\nonumber \\&\quad \le {} n^{-1/2}K_{(m)}\max _{1\le i \le n}\left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}). \end{aligned}$$
(A14)
Since for any \(\epsilon > 0\), by Boole’s and Markov’s inequalities and conditions (C7)–(C9),
$$\begin{aligned} \begin{aligned}&P\left( n^{-1/2}K_{(m)}\max _{1\le i \le n}\left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})> \epsilon \right) \\&\quad \le {} n P\left( \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) > n^{1/2}K_{(m)}^{-1}\epsilon \right) \\&\quad \le {} n^{-3}K_{(m)}^{8} E\left( \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|^8 L^8(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \right) \\&\quad = {} n^{-3}K_{(m)}^{8}O(K_{(m)}^4\underline{c}_{B(m)}^{-4}) = O(n^{-3}K_{(m)}^{12}\underline{c}_{B(m)}^{-4}) =o(1), \end{aligned} \end{aligned}$$
combining this with (A14) shows that (A8) holds. This completes the proof of part (ii).
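The rate \(n^{-3}K_{(m)}^{8}\) in the Markov step above comes from an eighth-moment bound. As a sketch, write \(W_i = \left|\varvec{d}_{(m)}^T\varvec{B}_{(m)}(\varvec{X}_{i(m)})\right|L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\) and \(t = n^{1/2}K_{(m)}^{-1}\epsilon \), where \(W_i\) and \(t\) are shorthand introduced only for this illustration. Then
$$\begin{aligned} nP\left( W_i> t\right) \le n\frac{E\left( W_i^8\right) }{t^8} = n\cdot n^{-4}K_{(m)}^{8}\epsilon ^{-8}E\left( W_i^8\right) = n^{-3}K_{(m)}^{8}\epsilon ^{-8}E\left( W_i^8\right) , \end{aligned}$$
and for fixed \(\epsilon \) the factor \(\epsilon ^{-8}\) is absorbed into the constant.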
\(\square \)
1.2 P2. Proof of Theorem 2
Proof
We only prove (i) as the proof of (ii) is similar. Let \(\delta _{n} = L\sqrt{n^{-1}\bar{K}\log n}\) for some large positive constant \(L< \infty \). Let
$$\begin{aligned} \bar{L}_{(m)}(\varvec{\beta }_{(m)}) = E\left[ \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }_{(m)}\right) L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right] . \end{aligned}$$
Define
$$\begin{aligned} \mathcal {D}(\delta _{n}, \varvec{z}_{(m)}) \triangleq \inf _{1\le m \le M} \inf _{\left\| \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| >\delta _n }\left[ \bar{L}_{(m)}(\varvec{\beta }_{(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})) \right] , \end{aligned}$$
(A15)
where \(\mathcal {C}(\delta _n, \varvec{z}_{(m)}) {\equiv } \left\{ \varvec{\beta }_{(m)}:\left\| \varvec{\beta }_{(m)}{-}\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| {>}\delta _n, \left\| \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| {=}o(1) \right\} \). By \(a_{i(m)} {=} \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\), Knight’s identity and condition (C6), for any \(\varvec{\beta }_{(m)}\in \mathcal {C}(\delta _n, \varvec{z}_{(m)})\), we have
$$\begin{aligned}&\bar{L}_{(m)}(\varvec{\beta }_{(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)}))\\&\quad ={} E\left\{ \left[ \rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }_{(m)}\right) -\rho _{\tau }\left( Y_i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right] \right. \\&\qquad \left. \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)},\varvec{\lambda }_{(m)})\right\} \\&\quad ={}E\left\{ \left[ \rho _{\tau }\left( \epsilon _i + a_{i(m)} - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \right) -\rho _{\tau }\left( \epsilon _i + a_{i(m)}\right) \right] \right. \\&\qquad \left. \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right\} \\&\quad ={}E\left\{ \int _{0}^{\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) } \left[ \varvec{1}(\epsilon _i + a_{i(m)}\le s)-\varvec{1}(\epsilon _i + a_{i(m)}\le 0)\right] ds \times L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\right\} \\&\quad ={}E\left\{ L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)}) \int _{0}^{\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) } \left[ F( -a_{i(m)}+s|\varvec{X}_i,\varvec{Z}_i) -F( -a_{i(m)}|\varvec{X}_i,\varvec{Z}_i)\right] ds \right\} \\&\quad \simeq {} \frac{1}{2}\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) ^TA_{(m)}\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \ge \frac{\underline{c}_A\delta _n^2}{2}. \end{aligned}$$
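The final approximation in the display above rests on a first-order Taylor expansion of the conditional distribution function. As a sketch, write \(t_i = \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right) \), a shorthand introduced here, and take \(A_{(m)} = E\left[ f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right] \), as suggested by (A4):
$$\begin{aligned} \int _{0}^{t_i}\left[ F( -a_{i(m)}+s|\varvec{X}_i,\varvec{Z}_i) -F( -a_{i(m)}|\varvec{X}_i,\varvec{Z}_i)\right] ds \simeq \int _{0}^{t_i} f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)s\, ds = \frac{1}{2}f(-a_{i(m)} |\varvec{X}_i,\varvec{Z}_i)t_i^2. \end{aligned}$$
Multiplying by \(L(\varvec{Z}_{i(m)}, \varvec{z}_{(m)}, \varvec{\lambda }_{(m)})\) and taking expectations gives the quadratic form in \(A_{(m)}\), and the lower bound follows from the smallest eigenvalue of \(A_{(m)}\) together with \(\left\| \varvec{\beta }_{(m)}-\varvec{\beta }^*_{(m)}(\varvec{z}_{(m)})\right\| > \delta _n\).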
Then, by Boole’s inequality, the definitions of \(\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)})\) and \(\mathcal {C}(\delta _n, \varvec{Z}_{i(m)})\), and the fact that
$$\begin{aligned}&\bar{L}_{(m)}(\hat{\varvec{\beta }}_{i(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})) \\&\quad \simeq \frac{1}{2}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^TA_{(m)}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \end{aligned}$$
for \(i =1,\ldots ,n\), we have
$$\begin{aligned}&P\left( \max _{1\le i \le n}\max _{1\le m \le M} \left\| \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \ge \delta _n\right) \nonumber \\&\quad \le {}nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \left\| \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \ge \delta _n\right) \nonumber \\&\quad \le {}nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \bar{L}_{(m)}(\hat{\varvec{\beta }}_{i(m)}) - \bar{L}_{(m)}(\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})) \ge \mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \nonumber \\&\quad \approx {}nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \mathbb {F}_{i(m)} \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \end{aligned}$$
(A16)
where \(\mathbb {F}_{i(m)} = n\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^TA_{(m)}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \). Thus the crucial step is to bound \(P\left( \mathbb {F}_{i(m)} \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \).
Following the proof of Theorem 1 (ii), we can also obtain that
\(\tilde{\eta }_{i(m)} \triangleq \sqrt{n}\varvec{D}_{(m)}^TC_{(m)}^{-1/2}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) {\mathop {\longrightarrow }\limits ^{d}} N(0, 1)\), and we have
\(\sqrt{n}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) = \left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^+\tilde{\eta }_{i(m)}\), where the superscript \(+\) denotes the Moore–Penrose generalized inverse. Thus for \(\mathbb {F}_{i(m)}\),
$$\begin{aligned} \mathbb {F}_{i(m)} = {}&n\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^TA_{(m)}\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \\ = {}&\tilde{\eta }_{i(m)}^T\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^{+T}A_{(m)}\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^+\tilde{\eta }_{i(m)}\\ \le {}&\lambda _{max}(A_{(m)})\tilde{\eta }_{i(m)}^T\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^{+T}\left( \varvec{D}_{(m)}^TC_{(m)}^{-1/2}\right) ^+\tilde{\eta }_{i(m)}\\ = {}&\lambda _{max}(A_{(m)})\tilde{\eta }_{i(m)}^T\left( \varvec{D}_{(m)}^TC_{(m)}^{-1}\varvec{D}_{(m)}\right) ^{-1}\tilde{\eta }_{i(m)}\\ \le {}&\lambda _{max}(A_{(m)})\lambda _{max}(C_{(m)})\left\| \tilde{\eta }_{i(m)}\right\| ^2 \le \frac{\bar{c}_{A}\bar{c}_{B}}{\underline{c}_{A}^2}\left\| \tilde{\eta }_{i(m)}\right\| ^2 \end{aligned}$$
where we have used the fact that \(A^{+T}A^+ = (AA^T)^+\) (Bernstein 2005, Proposition 6.1.6 (xvii)) together with conditions (C6)–(C7). Let \(c_{AB} = \frac{\bar{c}_{A}\bar{c}_{B}}{\underline{c}_{A}^2}\); then, according to Lemma 2.1 of Shibata (1981), we have
$$\begin{aligned}&nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \mathbb {F}_{i(m)} \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)}) \right) \nonumber \\&\quad \le {} nM\max _{1\le i \le n}\max _{1\le m \le M}P\left( \left\| \tilde{\eta }_{i(m)}\right\| ^2 \ge 2n\mathcal {D}(\delta _{n}, \varvec{Z}_{i(m)})/c_{AB} \right) \nonumber \\&\quad \le {} \limsup _{n\rightarrow \infty }nM \max _{1\le m \le M}P\left( \chi ^2(1) \ge 1 + \left[ n\underline{c}_A\delta _n^2/c_{AB}-1\right] \right) \nonumber \\&\quad \le {} \limsup _{n\rightarrow \infty } nM \exp \left( -\frac{\left[ n\underline{c}_A\delta _n^2/c_{AB}-1\right] }{2} \times \left[ 1- \frac{\log ( n\underline{c}_A\delta _n^2/c_{AB})}{ n\underline{c}_A\delta _n^2/c_{AB} -1}\right] \right) =0 \end{aligned}$$
(A17)
as \(nM\exp \left( -0.5 n\underline{c}_A\delta _n^2/c_{AB}\right) = nMn^{-0.5L^2\bar{K}\underline{c}_{A}^3/(\bar{c}_{A} \bar{c}_{B})} = o(1) \) for large enough L and \(\left[ \log ( n\underline{c}_A\delta _n^2/c_{AB})\right] / \left[ n\underline{c}_A\delta _n^2/c_{AB} -1\right] = o(1)\) under our conditions. Thus under (A16)–(A17), (i) follows. \(\square \)
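For concreteness, the exponent arithmetic behind the last display can be spelled out. This is a sketch under the assumption that \(\delta _n = L\left( \bar{K}\log n/n\right) ^{1/2}\), which is the form the stated power of \(n\) suggests; the abbreviation \(\kappa \) and the polynomial growth rate \(M = O(n^{a})\) are introduced here only for illustration.
$$\begin{aligned} \frac{n\underline{c}_A\delta _n^2}{c_{AB}} ={}&\underline{c}_A L^2\bar{K}\log n\cdot \frac{\underline{c}_{A}^2}{\bar{c}_{A}\bar{c}_{B}} = \kappa \log n, \qquad \kappa \triangleq \frac{L^2\bar{K}\underline{c}_{A}^3}{\bar{c}_{A}\bar{c}_{B}},\\ nM\exp \left( -\frac{\kappa \log n}{2}\right) ={}&M n^{1-\kappa /2} = O\left( n^{1+a-\kappa /2}\right) . \end{aligned}$$
Under these assumptions the bound is \(o(1)\) as soon as \(\kappa > 2(1+a)\), which can always be arranged by taking \(L\) large enough.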
1.3 P3. Proof of Theorem 3
Proof
Following Li (1987) and Lu and Su (2015), it suffices to show that
$$\begin{aligned} \sup _{\varvec{w}\in \mathcal {W}}\left|\frac{CV_n(\varvec{w})-FPE_n(\varvec{w})}{FPE_n(\varvec{w})} \right|= o_p(1). \end{aligned}$$
(A18)
Let \(E_i\) and \(E_{x,z}\) be the expectations with respect to \((\varvec{X}_i, \varvec{Z}_i)\) and \((\varvec{X}, \varvec{Z})\), respectively. By the fact that
$$\begin{aligned} \begin{aligned}&E\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\mu }\left[ F(s|\varvec{X},\varvec{Z}) -F(0|\varvec{X},\varvec{Z}) \right] ds |\mathcal {A}_n \right\} \\&\quad ={} E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{i(m)})-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \end{aligned} \end{aligned}$$
and Knight’s identity, we have
$$\begin{aligned}&CV_n(\varvec{w})-FPE_n(\varvec{w}) \nonumber \\&\quad ={} \left\{ \frac{1}{n}\sum _{i=1}^{n}\left[ \rho _\tau \left( Y_i-\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)} \right) -\rho _\tau (\epsilon _i) \right] \right\} \nonumber \\&\qquad -\left\{ FPE_n(\varvec{w}) - E[\rho _\tau (\epsilon )]\right\} + \frac{1}{n}\sum _{i=1}^{n}\left\{ \rho _\tau (\epsilon _i)-E[\rho _\tau (\epsilon )] \right\} \nonumber \\&\quad ={} \frac{1}{n}\sum _{i=1}^{n}\left[ \mu _i -\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)} \right] \psi (\epsilon _i) \nonumber \\&\qquad + \frac{1}{n}\sum _{i=1}^{n} \int _{0}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right] ds \nonumber \\&\qquad -E\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\mu }\left[ \varvec{1}(\epsilon \le s) -\varvec{1}(\epsilon \le 0) \right] ds |\mathcal {A}_n \right\} \nonumber \\&\qquad +\frac{1}{n}\sum _{i=1}^{n}\left\{ \rho _\tau (\epsilon _i)-E[\rho _\tau (\epsilon )] \right\} \nonumber \\&\quad \triangleq {} CV_{n1}(\varvec{w}) + CV_{n2}(\varvec{w}) + CV_{n3}(\varvec{w}) + CV_{n4}(\varvec{w}) + CV_{n5}(\varvec{w}) \end{aligned}$$
(A19)
where
$$\begin{aligned} CV_{n1}(\varvec{w}) \triangleq&\,\frac{1}{n}\sum _{i=1}^{n}\left[ \mu _i -\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)} \right] \psi (\epsilon _i),\\ CV_{n2}(\varvec{w}) \triangleq {}&\frac{1}{n}\sum _{i=1}^{n} \int _{0}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right. \\&\left. - F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds,\\ CV_{n3}(\varvec{w}) \triangleq {}&\frac{1}{n}\sum _{i=1}^{n}\left[ \int _{0}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ F(s|\varvec{X}_i, \varvec{Z}_i) - F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \right. \\&\left. - E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] ,\\ CV_{n4}(\varvec{w}) \triangleq {}&\frac{1}{n}\sum _{i=1}^{n} E_i\left[ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right. \\&\left. - E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{i(m)})-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] , \end{aligned}$$
and
$$\begin{aligned} \begin{aligned} CV_{n5}(\varvec{w}) \triangleq \frac{1}{n}\sum _{i=1}^{n}\left\{ \rho _\tau (\epsilon _i)-E[\rho _\tau (\epsilon )] \right\} . \end{aligned} \end{aligned}$$
In order to prove (A18), we need to show that (i) \(\min _{\varvec{w}\in \mathcal {W}}FPE_n(\varvec{w}) \ge E[\rho _\tau (\epsilon )] - o_p(1)\); (ii) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1}(\varvec{w})|= o_p(1)\); (iii) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n2}(\varvec{w})|= o_p(1)\); (iv) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n3}(\varvec{w})|= o_p(1)\); (v) \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n4}(\varvec{w})|= o_p(1)\); (vi) \(|CV_{n5}(\varvec{w})|= o_p(1)\). Claim (vi) follows from the weak law of large numbers, so it remains to show (i)–(v).
We first prove (i). Let \(u(\varvec{w}) \triangleq \mu -\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\).
By Knight’s identity, Taylor expansion, Jensen’s inequality, conditions (C5), (C6) and (C9), Theorem 2, and the fact that \(E [\psi _\tau (\epsilon +u(\varvec{w}))] = 0\) by the first-order condition for the population minimization problem (10), we have
$$\begin{aligned}&FPE_n(\varvec{w}) - E[\rho _\tau (\epsilon +u(\varvec{w}))] \\&\quad ={} E\left\{ \rho _\tau \left( \epsilon +u(\varvec{w}) - \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right) \right. \\&\qquad \left. - \rho _\tau (\epsilon +u(\varvec{w}))|\mathcal {A}_n \right\} \\&\quad ={} E\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) }\varvec{1}\left( \epsilon +u(\varvec{w}) \le s \right) \right. \\&\qquad \left. - \varvec{1}(\epsilon +u(\varvec{w})\le 0) ds|\mathcal {A}_n \right\} \\&\quad ={} E_{x,z}\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) } F\left( s-u(\varvec{w})|\varvec{X},\varvec{Z} \right) \right. \\&\qquad \left. 
- F\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) ds \right\} \\&\quad ={} E_{x,z}\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) } f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) s ds \right\} + o_p(1)\\&\quad ={} \frac{1}{2}E_{x,z}\left\{ f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) \left[ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right] ^2\right\} + o_p(1)\\&\quad \le {} \frac{1}{2}E_{x,z}\left\{ f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) \sum _{m=1}^{M}w_m \left[ \varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})\left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right] ^2\right\} + o_p(1)\\&\quad \le {} \frac{1}{2}\left\{ \sum _{m=1}^{M}w_m \left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) ^T\right. \\&\quad E[f\left( -u(\varvec{w})|\varvec{X},\varvec{Z} \right) \varvec{B}_{(m)}(\varvec{X}_{(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{(m)})]\\&\qquad \left. \times \left( \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right) \right\} + o_p(1)\\&\quad \le {} \frac{\bar{c}_A}{2}\max _{1\le m \le M}\left\| \hat{\varvec{\beta }}_{(m)}(\varvec{Z}_{(m)})-\varvec{\beta }^*_{(m)}(\varvec{Z}_{(m)})\right\| ^2 + o_p(1)\\&\quad ={}o_p(1). \end{aligned}$$
Define \(H(t) \triangleq E[\rho _\tau (\epsilon + t)- \rho _\tau (\epsilon )]\) where \(t \in R\). It can be seen that \(H(t)\) has a global minimum at \(t = 0\), so that \(\min _{\varvec{w}\in \mathcal {W}} E[\rho _\tau (\epsilon +u(\varvec{w}))] \ge E[\rho _\tau (\epsilon )]\). As a result, we have
$$\begin{aligned} \begin{aligned} \min _{\varvec{w}\in \mathcal {W}}FPE_n(\varvec{w}) ={}&\min _{\varvec{w}\in \mathcal {W}} E[\rho _\tau (\epsilon +u(\varvec{w}))] - o_p(1)\\ \ge {}&E[\rho _\tau (\epsilon )] - o_p(1). \end{aligned} \end{aligned}$$
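The claim that \(H(t)\) is minimized at \(t = 0\) can be checked directly from Knight’s identity. The sketch below works conditionally on \((\varvec{X},\varvec{Z})\) and assumes, as is standard in quantile regression, that the \(\tau \)-th conditional quantile of \(\epsilon \) is zero, so that \(E[\psi _\tau (\epsilon )|\varvec{X},\varvec{Z}] = 0\); this assumption is not restated in this part of the proof. For any \(t\) that is a function of \((\varvec{X},\varvec{Z})\), such as \(t = u(\varvec{w})\),
$$\begin{aligned} E\left[ \rho _\tau (\epsilon + t)- \rho _\tau (\epsilon )|\varvec{X},\varvec{Z}\right] ={}&t E\left[ \psi _\tau (\epsilon )|\varvec{X},\varvec{Z}\right] + \int _{0}^{-t}\left[ F(s|\varvec{X},\varvec{Z}) - F(0|\varvec{X},\varvec{Z})\right] ds \\ ={}&\int _{0}^{-t}\left[ F(s|\varvec{X},\varvec{Z}) - F(0|\varvec{X},\varvec{Z})\right] ds \ge 0. \end{aligned}$$
The integral is nonnegative for either sign of \(t\): if \(-t > 0\), the integrand is nonnegative on \([0,-t]\) because \(F\) is nondecreasing; if \(-t < 0\), the integrand is nonpositive while the orientation of the integral is reversed. Taking expectations over \((\varvec{X},\varvec{Z})\) with \(t = u(\varvec{w})\) gives \(E[\rho _\tau (\epsilon +u(\varvec{w}))] \ge E[\rho _\tau (\epsilon )]\) for every \(\varvec{w}\in \mathcal {W}\).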
(ii) For \(CV_{n1}(\varvec{w})\), we decompose it as follows
$$\begin{aligned} \begin{aligned} CV_{n1}(\varvec{w}) ={}&\frac{1}{n}\sum _{i=1}^{n}\left[ \mu _i -\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right] \psi (\epsilon _i)\\&- \frac{1}{n}\sum _{i=1}^{n}\left[ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right] \psi (\epsilon _i)\\ \triangleq {}&CV_{n1,1}(\varvec{w}) - CV_{n1,2}(\varvec{w}) \end{aligned} \end{aligned}$$
It can be seen that \(E[CV_{n1,1}(\varvec{w})] = 0\) and \(Var[CV_{n1,1}(\varvec{w})] = O(\bar{K}/n)\), so \(CV_{n1,1}(\varvec{w}) = o_p(1)\) for each \(\varvec{w} \in \mathcal {W}\). If both \(M\) and \(\bar{K} = \max _{1\le m \le M}K_m\) are finite, then by condition (C7) and the Glivenko–Cantelli theorem (Theorem 2.4.1 in Van der Vaart and Wellner (1996)), we have \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1,1}(\varvec{w})|= o_p(1)\). To be specific, consider the class of functions \(\mathcal {F} = \left\{ f(\cdot ,\cdot ,\cdot ;\varvec{w}): \varvec{w}\in \mathcal {W}\right\} \), where \( f(\cdot ,\cdot ,\cdot ;\varvec{w}): R\times \mathcal {S}_{X}\times \mathcal {S}_{Z} \rightarrow R\) is defined by
$$\begin{aligned} f(\epsilon _i,\varvec{X}_{i},\varvec{Z}_{i};\varvec{w}) = \left[ \mu _i -\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)}) \right] \psi (\epsilon _i). \end{aligned}$$
We define the metric \(|\cdot |_1\) on \(\mathcal {W}\) by \(|\varvec{w}_1-\varvec{w}_2|_1 = \sum _{m=1}^{M}|w_{m1}-w_{m2}|\) for any \(\varvec{w}_1 = (w_{11},w_{21},\ldots ,w_{M1})\) and \(\varvec{w}_2 = (w_{12},w_{22},\ldots ,w_{M2})\) in \(\mathcal {W}\). The \(\epsilon \)-covering number of \(\mathcal {W}\) with respect to \(|\cdot |_1\) is then \(N(\epsilon ) = O(1/\epsilon ^{M-1})\). According to the fact that
$$\begin{aligned} \begin{aligned}&|f(\epsilon _i,\varvec{X}_{i},\varvec{Z}_{i};\varvec{w}_1) - f(\epsilon _i,\varvec{X}_{i},\varvec{Z}_{i};\varvec{w}_2) |\\&\quad ={}\left|\sum _{m=1}^{M}(w_{m1} - w_{m2}) \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\psi (\epsilon _i) \right|\\&\quad \le {} c_{\beta }\left|\varvec{w}_1 - \varvec{w}_2\right|_1 \max _{1\le m \le M}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| , \end{aligned} \end{aligned}$$
where \(c_{\beta } = \max _{1\le m \le M}\left\| \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| = O(\bar{K}^{1/2})\) and \(\max _{1\le m \le M}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| <\infty \) in the case of finite \(M\) and \(\bar{K}\), together with Theorem 2.7.11 in Van der Vaart and Wellner (1996), the \(\epsilon \)-bracketing number of \(\mathcal {F}\) with respect to the \(L_1(P)\)-norm satisfies \(N(\epsilon , L_1(P))\le C/\epsilon ^{M-1}\) for some finite \(C\). Thus \(\mathcal {F}\) is Glivenko–Cantelli by Theorem 2.4.1 in Van der Vaart and Wellner (1996).
When either \(M \rightarrow \infty \) or \(\bar{K} \rightarrow \infty \) as \(n \rightarrow \infty \), the above argument no longer applies. To allow for diverging \(M\) and \(\bar{K}\), let \(r_n = (\bar{K}\log n)^{-1}\). We build grids using regions of the form \(\textbf{W}_j = \left\{ \varvec{w}:|\varvec{w} - \varvec{w}_j|_1 \le r_n\right\} \). By choosing \(\varvec{w}_j = (w_{1j},\ldots ,w_{Mj})^T\) suitably, \(\mathcal {W}\) can be covered with \(N_{w} = O(1/r_n^{M-1})\) regions \(\textbf{W}_j\), \(j = 1,\ldots , N_{w}\). It can be seen that
$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \textbf{W}_j}\left|CV_{n1,1}(\varvec{w}) - CV_{n1,1}(\varvec{w}_j)\right|\\&\quad ={}\sup _{\varvec{w}\in \textbf{W}_j}\left|\sum _{m=1}^{M}(w_{m} - w_{mj})\frac{1}{n}\sum _{i=1}^{n} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\psi (\epsilon _i) \right|\\&\quad \le {}O_p(\bar{K}) \sup _{\varvec{w}\in \textbf{W}_j}\sum _{m=1}^{M}|w_{m} - w_{mj}|\\&\quad \le {}O_p(\bar{K})r_n = o_p(1), \end{aligned} \end{aligned}$$
where the result holds uniformly in \(\textbf{W}_j\). Therefore, we have
$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1,1}(\varvec{w})|\\&\quad ={} \max _{1\le j \le N_{w}} \sup _{\varvec{w}\in \textbf{W}_j}|CV_{n1,1}(\varvec{w})|\\&\quad \le {} \max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|+\max _{1\le j \le N_{w}} \sup _{\varvec{w}\in \textbf{W}_j}|CV_{n1,1}(\varvec{w})-CV_{n1,1}(\varvec{w}_j)|\\&\quad ={} \max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|+ o_p(1). \end{aligned} \end{aligned}$$
Here and in what follows, we use the fact that
$$\begin{aligned} \begin{aligned}&\max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \\&\quad \le {}\max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}E\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \\&\qquad + \max _{1\le m \le M}\left|\frac{1}{n}\sum _{i=1}^{n}\left( \left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| - E\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \right) \right|\\&\quad ={}O(\bar{K}^{1/2}) + o_p(1) = O_p(\bar{K}^{1/2}). \end{aligned} \end{aligned}$$
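This fact is what produces the factor \(O_p(\bar{K})\) in the grid bound for \(CV_{n1,1}(\varvec{w})\). As a sketch, using \(|\psi (\epsilon _i)|\le 1\) and treating \(c_{\beta } = O(\bar{K}^{1/2})\) as a bound on \(\left\| \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \) that is uniform in \(i\) and \(m\) (an assumption made here for illustration),
$$\begin{aligned} \max _{1\le m \le M}\left|\frac{1}{n}\sum _{i=1}^{n} \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\psi (\epsilon _i) \right|\le {}&c_{\beta }\max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)}) \right\| \\ ={}&O(\bar{K}^{1/2})\, O_p(\bar{K}^{1/2}) = O_p(\bar{K}),\\ O_p(\bar{K})\, r_n ={}&O_p\left( \frac{\bar{K}}{\bar{K}\log n}\right) = O_p\left( \frac{1}{\log n}\right) = o_p(1). \end{aligned}$$
The choice \(r_n = (\bar{K}\log n)^{-1}\) is thus exactly what is needed to offset the \(\bar{K}\) growth while keeping the grid oscillation negligible.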
Let \(h_n = (Mn\bar{K}^2)^{1/2}\). For \(CV_{n1,1}(\varvec{w}_j)\), we have \(\max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|\le \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)} \right|\), where \(u_{i(m)} = \left( \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \psi _\tau (\epsilon _i)\) and \(E(u_{i(m)}) = 0\). Thus for any \(\epsilon > 0\), we have
$$\begin{aligned} \begin{aligned}&P\left( \max _{1\le j \le N_{w}} |CV_{n1,1}(\varvec{w}_j)|\ge 2\epsilon \right) \\&\quad \le {} P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)} \right|\ge 2\epsilon \right) \\&\quad \le {} P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n) \right|\ge \epsilon \right) \\&\qquad + P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|> h_n) \right|\ge \epsilon \right) \\&\quad \triangleq {} R_{n1} + R_{n2}. \end{aligned} \end{aligned}$$
Since, by the fact that \(A^TBA \le \lambda _{max}(B)A^TA\) for any real symmetric matrix \(B\) and conformable \(A\),
$$\begin{aligned} \begin{aligned}&Var(u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n))\le E(u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n))^2\\&\quad \le {} 2E(\mu _i^2\psi _\tau ^2(\epsilon _i)) + 2\varvec{\beta }^{*T}_{(m)}(\varvec{Z}_{i(m)})E[\varvec{B}_{(m)}(\varvec{X}_{i(m)}) \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})]\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)}) \le c_{\bar{K}}\bar{K} \end{aligned} \end{aligned}$$
for some positive constant \(c_{\bar{K}} < \infty \), it follows from Bernstein’s and Boole’s inequalities and the condition \(n^{-1}\bar{K}^2M(\log M)^2 = o(1)\) that
$$\begin{aligned} R_{n1}\le {}&M\max _{1\le m \le M}P\left( \left|\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|\le h_n) \right|\ge n\epsilon \right) \\ \le {}&2M\exp \left( -\frac{n\epsilon ^2}{2c_{\bar{K}}\bar{K}+2\epsilon h_n/3} \right) \\ ={}&2\exp \left( -\frac{n\epsilon ^2}{2c_{\bar{K}}\bar{K}+2\epsilon h_n/3} + \log M \right) = o(1). \end{aligned}$$
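To see why this bound vanishes, the exponent can be expanded using \(h_n = (Mn\bar{K}^2)^{1/2}\). The following is a sketch for fixed \(\epsilon > 0\), assuming \(M \ge 2\) so that \(\log M > 0\). Since \(\bar{K}/h_n = (Mn)^{-1/2} \rightarrow 0\), the term \(2\epsilon h_n/3\) dominates the denominator, and
$$\begin{aligned} \frac{n\epsilon ^2}{2c_{\bar{K}}\bar{K}+2\epsilon h_n/3} ={}&\frac{3\epsilon }{2}\cdot \frac{n}{h_n}\left[ 1+o(1)\right] = \frac{3\epsilon }{2}\sqrt{\frac{n}{M\bar{K}^2}}\left[ 1+o(1)\right] ,\\ \frac{\log M}{\sqrt{n/(M\bar{K}^2)}} ={}&\sqrt{\frac{\bar{K}^2M(\log M)^2}{n}} \rightarrow 0. \end{aligned}$$
Hence \(\log M\) is of smaller order than the Bernstein exponent, which itself diverges, so the whole expression inside the exponential tends to \(-\infty \) and \(R_{n1} = o(1)\).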
Similarly, by Markov’s and Boole’s inequalities and condition (C1), we have
$$\begin{aligned} R_{n2} \le {}&P\left( \max _{1\le m \le M} \left|\frac{1}{n}\sum _{i=1}^{n} u_{i(m)}\varvec{1}(|u_{i(m)}|> h_n) \right|\ge \epsilon \right) \\ \le {}&P\left( \max _{1\le m \le M} \frac{1}{n}\sum _{i=1}^{n} \left|u_{i(m)} \right|\varvec{1}(|u_{i(m)}|> h_n) \ge \epsilon \right) \\ \le {}&P\left( \max _{1\le m \le M} \max _{1\le i \le n} |u_{i(m)}|> h_n \right) \\ \le {}&\sum _{i=1}^{n}\sum _{m=1}^{M}P\left( \left|\left( \mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \psi _\tau (\epsilon _i)\right|> h_n \right) \\ \le {}&\frac{1}{h_n^4}\sum _{i=1}^{n}\sum _{m=1}^{M}E\left[ \left|\mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right|^4 \psi _\tau ^4(\epsilon _i)\right. \\&\left. \times \varvec{1}\left( \left|\mu _i - \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right|^4 \psi _\tau ^4(\epsilon _i) > h_n^4 \right) \right] \\ ={}&o(1). \end{aligned}$$
Thus we have \(\max _{1\le j \le N_{w}}|CV_{n1,1}(\varvec{w}_j)|= o_p(1)\) and \(\sup _{\varvec{w}\in \mathcal {W}}|CV_{n1,1}(\varvec{w})|= o_p(1)\).
By the triangle inequality and condition (C9), we have
$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}} |CV_{n1,2}(\varvec{w})|\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\sum _{m=1}^{M}w_m\frac{1}{n}\sum _{i=1}^{n}\left|\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \psi (\epsilon _i)\right|\\&\quad \le {} \max _{1\le i \le n}\max _{1\le m \le M}\left\| \hat{\varvec{\beta }}_{i(m)}-\varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \max _{1\le m \le M} \frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| \\&\quad ={}O_p\left( \sqrt{n^{-1}\bar{K}\log n}\right) O_p(\bar{K}^{1/2}) = o_p(1). \end{aligned} \end{aligned}$$
Thus \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n1,2}(\varvec{w})|= o_p(1)\).
(iii) For \(CV_{n2}(\varvec{w})\), we can see that
$$\begin{aligned}{} & {} CV_{n2}(\varvec{w})\\{} & {} \quad = {} \frac{1}{n}\sum _{i=1}^{n} \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right. \\{} & {} \qquad \left. - F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \\{} & {} \qquad + \frac{1}{n}\sum _{i=1}^{n} \int _{ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ \varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0) \right. \\{} & {} \qquad \left. - F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \\{} & {} \quad ={} CV_{n2,1}(\varvec{w})+ CV_{n2,2}(\varvec{w}) \end{aligned}$$
Observing that \(E[CV_{n2,1}(\varvec{w})] = 0\) and \(Var[CV_{n2,1}(\varvec{w})] = O(\bar{K}/n)\), we have \(CV_{n2,1}(\varvec{w}) = o_p(1)\) for each \(\varvec{w} \in \mathcal {W}\). By an argument similar to that for \(CV_{n1,1}(\varvec{w})\), \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n2,1}(\varvec{w})|= o_p(1)\).
Considering the fact that \(\left|\varvec{1}(\epsilon _i\le s) -\varvec{1}(\epsilon _i\le 0)- F(s|\varvec{X}_i, \varvec{Z}_i) + F(0|\varvec{X}_i, \varvec{Z}_i) \right|\le 2\), we have
$$\begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}}|CV_{n2,2}(\varvec{w})|\nonumber \\&\quad \le {}\sup _{\varvec{w}\in \mathcal {W}}\frac{2}{n}\sum _{i=1}^{n} \left|\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\nonumber \\&\quad \le {}2\max _{1\le i \le n}\max _{1\le m \le M}\left\| \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| \max _{1\le m \le M}\frac{1}{n}\sum _{i=1}^{n}\left\| \varvec{B}_{(m)}(\varvec{X}_{i(m)})\right\| \nonumber \\&\quad ={}O_p\left( \sqrt{n^{-1}\bar{K}\log n}\right) O_p(\bar{K}^{1/2}) = o_p(1). \end{aligned}$$
(iv) For \(CV_{n3}(\varvec{w})\), we have
$$\begin{aligned} CV_{n3}(\varvec{w})= & {} \frac{1}{n}\sum _{i=1}^{n}\left[ \int _{0}^{ \sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i} \left[ F(s|\varvec{X}_i, \varvec{Z}_i) - F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \right. \\{} & {} \left. - E_i\left\{ \int _{0}^{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] \\{} & {} + \frac{1}{n}\sum _{i=1}^{n}\left[ \int _{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}^{ \sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i} \left[ F(s|\varvec{X}_i, \varvec{Z}_i) - F(0|\varvec{X}_i, \varvec{Z}_i) \right] ds \right. \\{} & {} \left. - E_i\left\{ \int _{\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)}) \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})-\mu _i}^{\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\hat{\varvec{\beta }}_{i(m)}-\mu _i}\left[ F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i) \right] ds \right\} \right] \\{} & {} \triangleq {} CV_{n3,1}(\varvec{w}) + CV_{n3,2}(\varvec{w}) \end{aligned}$$
The proof of \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n3,1}(\varvec{w})|= o_p(1)\) is similar to that of \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n1,1}(\varvec{w})|= o_p(1)\). Since \(|F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i)|\le 1\), we have
$$\begin{aligned}{} & {} |CV_{n3,2}(\varvec{w})|\\{} & {} \quad \le {} \frac{1}{n}\sum _{i=1}^{n} \left|\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\{} & {} \qquad +\frac{1}{n}\sum _{i=1}^{n} E_i \left|\sum _{m=1}^{M}w_m \varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\{} & {} \quad \triangleq {} CV_{n3,21}(\varvec{w}) + CV_{n3,22}(\varvec{w}). \end{aligned}$$
\(CV_{n3,21}(\varvec{w})\) can be bounded in the same way as \(CV_{n2,2}(\varvec{w})\). For \(CV_{n3,22}(\varvec{w})\), by the Cauchy–Schwarz and triangle inequalities, Theorem 1, and the fact that \(A^TBA\le \lambda _{max}(B)A^TA\) for any real symmetric matrix \(B\), we can see that
$$\begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}} CV_{n3,22}(\varvec{w})\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\frac{1}{n}\sum _{i=1}^{n} \sum _{m=1}^{M}w_mE_i \left|\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\frac{1}{n}\sum _{i=1}^{n} \sum _{m=1}^{M}w_m \left[ \left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) ^T E_i\left( \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right) \right. \\&\qquad \left. \times \left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right] ^{1/2} \\&\quad \le {} \max _{1\le m \le M} \left[ \lambda _{max}\left( E_i\left( \varvec{B}_{(m)}(\varvec{X}_{i(m)})\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\right) \right) \right] ^{1/2} \\&\qquad \times \max _{1\le i \le n}\max _{1\le m \le M} \left\| \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right\| =o_p(1). \end{aligned}$$
Thus \(\sup _{\varvec{w}\in \mathcal {W}} |CV_{n3,2}(\varvec{w})|= o_p(1)\).
(v) For \(CV_{n4}(\varvec{w})\), since \(|F(s|\varvec{X}_i,\varvec{Z}_i) -F(0|\varvec{X}_i,\varvec{Z}_i)|\le 1\), and referring to the proof of \(CV_{n3,22}(\varvec{w})\), we have
$$\begin{aligned} \begin{aligned}&\sup _{\varvec{w}\in \mathcal {W}}|CV_{n4}(\varvec{w})|\\&\quad \le {} \sup _{\varvec{w}\in \mathcal {W}}\frac{1}{n}\sum _{i=1}^{n} E_i\left|\sum _{m=1}^{M}w_m\varvec{B}_{(m)}^{T}(\varvec{X}_{i(m)})\left( \hat{\varvec{\beta }}_{i(m)}- \varvec{\beta }^*_{(m)}(\varvec{Z}_{i(m)})\right) \right|\\&\quad ={} o_p(1). \end{aligned} \end{aligned}$$
Thus the theorem has been proved. \(\square \)