Abstract
This study considers rank estimation of the regression coefficients of the single index regression model. Conditions needed for the consistency and asymptotic normality of the proposed estimator are established. Monte Carlo simulation experiments demonstrate the robustness and efficiency of the proposed estimator compared to the semiparametric least squares estimator. A real-life example illustrates that the rank regression procedure effectively corrects model nonlinearity even in the presence of outliers in the response space.
References
Abebe, A., McKean, J. W. (2013). Weighted Wilcoxon estimators in nonlinear regression. Australian and New Zealand Journal of Statistics, 55(4), 401–420.
Andrews, D. W. K. (1994). Asymptotics for semiparametric econometric models via stochastic equicontinuity. Econometrica, 62(1), 43–72.
Bindele, H. F., Abebe, A. (2012). Bounded influence nonlinear signed-rank regression. Canadian Journal of Statistics, 40(1), 172–189.
Carroll, R. J., Fan, J., Gijbels, I., Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92(438), 477–489.
Chang, W. H., McKean, J. W., Naranjo, J. D., Sheather, S. J. (1999). High-breakdown rank regression. Journal of the American Statistical Association, 94(445), 205–219.
Delecroix, M., Härdle, W., Hristache, M. (2003). Efficient estimation in conditional single-index regression. Journal of Multivariate Analysis, 86(2), 213–226.
Delecroix, M., Hristache, M., Patilea, V. (2006). On semiparametric M-estimation in single-index regression. Journal of Statistical Planning and Inference, 136(3), 730–769.
Feng, L., Zou, C., Wang, Z. (2012). Rank-based inference for the single-index model. Statistics & Probability Letters, 82(3), 535–541.
Hájek, J., Šidák, Z., Sen, P. K. (1999). Theory of rank tests. Probability and mathematical statistics (2nd ed.). San Diego, CA: Academic Press, Inc.
Han, A. K. (1987). Non-parametric analysis of a generalized regression model: The maximum rank correlation estimator. Journal of Econometrics, 35(2), 303–316.
Härdle, W., Stoker, T. M. (1989). Investigating smooth multiple regression by the method of average derivatives. Journal of the American Statistical Association, 84(408), 986–995.
Härdle, W., Tsybakov, A. B. (1993). How sensitive are average derivatives? Journal of Econometrics, 58(1–2), 31–48.
Härdle, W., Hall, P., Ichimura, H. (1993). Optimal smoothing in single-index models. The Annals of Statistics, 21(1), 157–178.
Hettmansperger, T. P., McKean, J. W. (1998). Robust nonparametric statistical methods, volume 5 of Kendall’s library of statistics. London: Edward Arnold.
Hettmansperger, T. P., McKean, J. W. (2011). Robust nonparametric statistical methods, volume 119 of monographs on statistics and applied probability (2nd ed.). Boca Raton, FL: CRC Press.
Horowitz, J. L., Härdle, W. (1996). Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association, 91(436), 1632–1640.
Hristache, M., Juditsky, A., Spokoiny, V. (2001). Direct estimation of the index coefficient in a single-index model. The Annals of Statistics, 29(3), 595–623.
Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58(1–2), 71–120.
Jaeckel, L. A. (1972). Estimating regression coefficients by minimizing the dispersion of the residuals. Annals of Mathematical Statistics, 43, 1449–1458.
Klein, R. W., Spady, R. H. (1993). An efficient semiparametric estimator for binary response models. Econometrica, 61(2), 387–421.
Kutner, M., Nachtsheim, C., Neter, J., Li, W. (2004). Applied linear statistical models. New York: McGraw-Hill/Irwin.
Liu, J., Zhang, R., Zhao, W., Lv, Y. (2013). A robust and efficient estimation method for single index models. Journal of Multivariate Analysis, 122, 226–238.
McCullagh, P., Nelder, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman & Hall.
Naranjo, J. D., Hettmansperger, T. P. (1994). Bounded influence rank regression. Journal of the Royal Statistical Society. Series B, 56(1), 209–220.
Newey, W. K. (2004). Efficient semiparametric estimation via moment restrictions. Econometrica, 72(6), 1877–1897.
Newey, W. K., McFadden, D. (1994). Large sample estimation and hypothesis testing. Handbook of econometrics, 4, 2111–2245.
Powell, J. L., Stock, J. H., Stoker, T. M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57(6), 1403–1430.
R Development Core Team. (2009). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Rao, T. S., Das, S., Boshnakov, G. N. (2014). A frequency domain approach for the estimation of parameters of spatio-temporal stationary random processes. Journal of Time Series Analysis, 35(4), 357–377.
Serfling, R. J. (1980). Approximation theorems of mathematical statistics. Wiley series in probability and mathematical statistics. New York: Wiley.
Sherman, R. P. (1994). Maximal inequalities for degenerate U-processes with applications to optimization estimators. The Annals of Statistics, 22(1), 439–459.
Whitt, W. (2011). Stochastic-process limits: An introduction to stochastic-process limits and their application to queues. New York: Springer.
Xia, Y. (2006). Asymptotic distribution for two estimators of the single-index model. Econometric Theory, 22, 1112–1137.
Xia, Y., Tong, H., Li, W. K., Zhu, L.-X. (2002). An adaptive estimation of dimension reduction space. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3), 363–410.
Xiang, X. (1995). A strong law of large number for $L$-statistics in the non-i.d. case. Communications in Statistics. Theory and Methods, 24(7), 1813–1819.
Yin, X., Cook, R. D. (2005). Direction estimation in single-index regressions. Biometrika, 92(2), 371–384.
Yu, Y., Ruppert, D. (2002). Penalized spline estimation for partially linear single-index models. Journal of the American Statistical Association, 97(460), 1042–1054.
Appendix
This appendix contains the proofs of the main theoretical results, together with a key lemma due to Delecroix et al. (2006) that ensures the uniform strong consistency of the leave-one-out Nadaraya–Watson estimator. For details on the proof of this lemma, readers are referred to the aforementioned paper.
Lemma 3
Let \(\mathscr {B}_{n}:=\{\mathbf {\beta }:~\Vert \mathbf {\beta }-\mathbf {\beta }_{0}\Vert \le d_{n}\}\), where \(d_{n}\) is some sequence decreasing to zero. Then,
(a)
if \(\delta >0\), we have,
$$\begin{aligned} \sup _{\mathbf {\beta }\in \mathscr {B}_{n},h\in \mathscr {H}_{n}}\left| I_{\{{\mathbf x}:\widehat{\mu }_{\mathbf {\beta },h}^{i}({\mathbf x}^{\tau }\mathbf {\beta })\ge c\}}({\mathbf X}_{i})-I_{\varGamma }({\mathbf X}_{i})\right| \le I_{\varGamma ^{\delta }}({\mathbf X}_{i})+I_{(\delta ,\infty )}(Z_{n}), \end{aligned}$$where \(\varGamma ^{\delta }=\{{\mathbf x}:~|\mu _{\mathbf {\beta }_{0},h}({\mathbf x}^{\tau }\mathbf {\beta }_{0})-c|\le \delta \}\) and
$$\begin{aligned} Z_{n}=\max _{1\le i\le n}\sup _{\mathbf {\beta }\in \mathscr {B}_{n},h\in \mathscr {H}_{n}} \left| \widehat{\mu }_{\mathbf {\beta },h}^{i}({\mathbf X}_{i}^{\tau }\mathbf {\beta }) -\mu _{\mathbf {\beta }_{0},h}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_{0})\right| . \end{aligned}$$
(b)
Assume \(d_{n}=o(1/\sqrt{n})\), and there exists a sequence \(\delta _{n}\rightarrow 0\) such that \(\delta _{n}/n^{-a\varepsilon }\rightarrow \infty \) and \(\delta _{n}[d_{n}\sqrt{n}]^{-a\varepsilon }\rightarrow \infty \), for some \(a>0\), then \(I_{(\delta _{n},\infty )}(Z_{n})=o_{p}(n^{-\alpha })\), for all \(\alpha >0\). Moreover, together with assumptions \((I_2)\)–\((I_4)\), assuming that \(E(|Y|^2)<\infty \), we have
$$\begin{aligned} \max _{1\le i\le n}\sup _{\mathbf {\beta }\in \mathscr {B},h\in \mathscr {H}_{n}}| \widehat{g}_{\mathbf {\beta },h}^{i}({\mathbf X}_{i}^{\tau }\mathbf {\beta }) -g_{\mathbf {\beta }}({\mathbf X}_{i}^{\tau }\mathbf {\beta })|I_{\varGamma }({\mathbf X}_{i})\rightarrow 0\quad a.s.\; \text{ as } n\rightarrow \infty \text{, } \end{aligned}$$and
$$\begin{aligned} \max _{1\le i\le n}\sup _{\mathbf {\beta }\in \mathscr {B},h\in \mathscr {H}_{n}}|\nabla _{\mathbf {\beta }}[\widehat{g}_{\mathbf {\beta },h}^{i}({\mathbf X}_{i}^{\tau }\mathbf {\beta })]-\nabla _{\mathbf {\beta }}[g_{\mathbf {\beta }}({\mathbf X}_{i}^{\tau }\mathbf {\beta })]|I_{\varGamma }({\mathbf X}_{i})\rightarrow 0\quad a.s.\; \text{ as } n\rightarrow \infty \text{. } \end{aligned}$$
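As a numerical illustration of the uniform consistency asserted in Lemma 3 (not part of the formal argument), the following Python sketch builds the leave-one-out Nadaraya–Watson estimator with a Gaussian kernel; the link \(g(t)=\sin t\), the bandwidth, and the design are hypothetical choices, not prescribed by the lemma.

```python
import numpy as np

def nw_loo(y, t, h):
    """Leave-one-out Nadaraya-Watson estimates: the i-th fitted value
    g_hat^i(t_i) uses all observations except the i-th."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)  # Gaussian kernel
    np.fill_diagonal(w, 0.0)                                 # leave one out
    return w @ y / w.sum(axis=1)

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, size=(n, 2))
beta0 = np.array([1.0, 0.5])
t = x @ beta0                        # single index X' beta_0
g_true = np.sin(t)                   # hypothetical link
y = g_true + 0.1 * rng.normal(size=n)

ghat = nw_loo(y, t, h=0.1)
# maximum error over interior index values (trimming, as with I_Gamma)
err = np.max(np.abs(ghat - g_true)[np.abs(t) < 1.0])
```

The trimming to \(|t|<1\) plays the role of the indicator \(I_{\varGamma }\): the kernel estimator is only uniformly well behaved away from regions where the density of the index is small.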
Proof of Lemma 1
(i) By definition, we have \(A_{n}(\hat{\alpha }_{n})\le A_{n}(\alpha _{0,n})\) and \(E(A_{n}(\alpha _{0,n}))\le E(A_{n}(\hat{\alpha }_{n}))\). These inequalities give \(A_{n}(\hat{\alpha }_{n})-E(A_{n}(\hat{\alpha }_{n}))\le A_{n}(\hat{\alpha }_{n})-E(A_{n}(\alpha _{0,n}))\le A_{n}(\alpha _{0,n})-E(A_{n}(\alpha _{0,n}))\). Thus,
Since \(\alpha _{0,n}\) is unique for any fixed n, \(\alpha _{0,n}\rightarrow \alpha _{0}\) and \(\displaystyle \sup _{\alpha \in \varTheta }|A_{n}(\alpha )-E(A_{n}(\alpha ))|\rightarrow 0\;\;a.s.\) as \(n\rightarrow \infty \), we have \(\hat{\alpha }_{n}\rightarrow \alpha _{0}\;\;a.s.\) as \(n\rightarrow \infty \). \(\square \)
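The mechanism of Lemma 1, uniform almost-sure convergence of the criterion plus a unique population minimizer forcing consistency of the argmin, can be seen on a toy criterion (a squared-error objective with minimizer \(\alpha _{0}=2\); the example is hypothetical and unrelated to the paper's \(A_{n}\)).

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 4.0, 401)   # candidate values of alpha

def argmin_hat(n):
    """Minimize A_n(alpha) = n^{-1} sum_k (Z_k - alpha)^2 over the grid;
    A_n converges uniformly (a.s.) to E(Z - alpha)^2, minimized at E[Z] = 2."""
    z = rng.normal(2.0, 1.0, size=n)
    a_n = (z ** 2).mean() - 2.0 * grid * z.mean() + grid ** 2
    return grid[np.argmin(a_n)]

estimates = [argmin_hat(n) for n in (100, 10_000, 1_000_000)]
```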
Proof of Lemma 2
We provide the proof of Eq. (9); those of Eqs. (10) and (11) can be obtained using similar arguments. By Chebyshev’s inequality, we have, for any \(\varepsilon >0\),
Setting \(a_{ni}(\mathbf {\beta }_0)=R(\nu _{ni}(\mathbf {\beta }_0))/(n+1)\), \(b_{ni}(\mathbf {\beta }_0)=R(z_{i}(\mathbf {\beta }_0))/(n+1)\), let us introduce the following notation: \(\psi _{i}(\mathbf {\beta }_0)=\varphi (a_{ni}(\mathbf {\beta }_0)) -\varphi (b_{ni}(\mathbf {\beta }_0))\) and \(U_{i}(\mathbf {\beta }_0)=I_{\varGamma _{n}}({\mathbf X}_{i})\nabla _{\mathbf {\beta }_0} [\widehat{g}_{\mathbf {\beta }_{0},h}^{i}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)] -\,I_{\varGamma }({\mathbf X}_{i})\nabla _{\mathbf {\beta }_0}[g_{\mathbf {\beta }_{0}} ({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)]\).
We now show that \(J_{in}\rightarrow 0\) as \(n\rightarrow \infty \), for \(i=1,2,3\). Indeed, from the boundedness of \(\varphi \), there exists a positive constant L such that \(|\varphi (u)|\le L\), for all \(u\in (0,1)\). Also
For \(i=1,\ldots ,n\), \(F_{\nu }(\nu _{ni}(\mathbf {\beta }_0))\) and \(F(z_{i}(\mathbf {\beta }_0))\) are independent random variables uniformly distributed on (0, 1). Following Chapter 6 of Hájek et al. (1999), we obtain \(a_{ni}(\mathbf {\beta }_0)-F_{\nu }(\nu _{ni}(\mathbf {\beta }_0))\rightarrow 0\;a.s.\) and \(b_{ni}(\mathbf {\beta }_0)-F(z_{i}(\mathbf {\beta }_0))\rightarrow 0\;a.s.\), for each i. Thus, by continuity of \(\varphi \) and by Lemma 3, we have \(\varphi (a_{ni}(\mathbf {\beta }_0))-\varphi (F_{\nu }(\nu _{ni}(\mathbf {\beta }_0)))\rightarrow 0\;a.s.\) and \(\varphi (F(z_{i}(\mathbf {\beta }_0)))-\varphi (b_{ni}(\mathbf {\beta }_0))\rightarrow 0\;a.s.\), for each i. Also, by Lemma 3, we have \(\nu _{ni}(\mathbf {\beta }_0)-z_{i}(\mathbf {\beta }_0)\rightarrow 0\;a.s.\), from which, by the continuity of the probability measure and the continuity of \(\varphi \), we have \(\varphi (F_{\nu }(\nu _{ni}(\mathbf {\beta }_0)))-\varphi (F(z_{i}(\mathbf {\beta }_0)))\rightarrow 0\;a.s.\), for each i. On the other hand,
For \({\mathbf X}_{i}\in \varGamma \) and for all \(\varepsilon >0\), there exists \(N>0\) such that for all \(n\ge N\),
\(\varepsilon \) being arbitrary, letting \(\varepsilon \rightarrow 0\), we have \(\Vert \nabla _{\mathbf {\beta }_0}[\widehat{g}_{\mathbf {\beta }_{0},h}^{i} ({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)]\Vert \le J({\mathbf X}_{i})<\infty \;a.s. \), as J is integrable. Thus, by Lemma 3, \(|I_{\varGamma _{n}}({\mathbf X}_{i})-I_{\varGamma }({\mathbf X}_{i})|\Vert \nabla _{\mathbf {\beta }_0} [\widehat{g}_{\mathbf {\beta }_{0},h}^{i}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)]\Vert \rightarrow 0\;a.s.\) and \(\Vert \nabla _{\mathbf {\beta }_0}[\widehat{g}_{\mathbf {\beta }_{0},h}^{i} ({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)]-\nabla _{\mathbf {\beta }_0}[g_{\mathbf {\beta }_{0}} ({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)]\Vert I_{\varGamma }({\mathbf X}_{i})\rightarrow 0\;a.s.\), for all i. Therefore, \(\Vert U_{i}(\mathbf {\beta }_0)\Vert \rightarrow 0\;a.s.\), for all i. Then
by applying the dominated convergence theorem together with Lemma 3. Next, using the Cauchy–Schwarz inequality, we have
By the strong law of large numbers (SLLN), \(n^{-1}\sum _{i=1}^{n}J^{4}({\mathbf X}_{i}) \rightarrow E\{J^{4}({\mathbf X})\}<\infty \;a.s.\) Also, from the above discussion, \(\max _{1\le i\le n}\left| \psi _i(\mathbf {\beta }_0)\right| ^4\rightarrow 0\;a.s.\) Thus, applying the dominated convergence theorem once again, we have \(J_{2n}\rightarrow 0\;a.s.\) Moreover, using the simple inequality \(ab\le (a^2+b^2)/2\) together with the Cauchy–Schwarz inequality, we have
By Lemma 3, \(\displaystyle \max _{1\le i\le n}\left\| U_{i}(\mathbf {\beta }_0)\right\| ^{4}\rightarrow 0\;a.s.\), and again, by the SLLN, \(\displaystyle \frac{1}{n}\sum _{i=1}^{n}J^{4}({\mathbf X}_{i})\) converges almost surely to \(E\{J^{4}({\mathbf X})\}<\infty .\) Also, as before, \(\displaystyle \max _{1\le i\le n}\left| \psi _{i}(\mathbf {\beta }_0)\right| ^2\rightarrow 0\;a.s.\) Thus, once again, a direct application of the dominated convergence theorem gives \(J_{3n}\rightarrow 0\;a.s.\) and consequently, \(\displaystyle \lim _{n\rightarrow \infty } P_{\mathbf {\beta }_0}\left( \sqrt{n}\Vert \widetilde{S}_{n} (\mathbf {\beta }_0)-S_{n}(\mathbf {\beta }_0)\Vert >\varepsilon \right) =0\). \(\square \)
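A recurring step in the proofs above, taken from Chapter 6 of Hájek et al. (1999), is that normalized ranks \(R(z_{i})/(n+1)\) converge almost surely to the corresponding distribution-function values \(F(z_{i})\). A quick numerical check, assuming standard normal residuals purely for illustration:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)

def max_rank_cdf_gap(n):
    """max_i | R(z_i)/(n+1) - F(z_i) | for an i.i.d. N(0,1) sample."""
    z = rng.normal(size=n)
    ranks = z.argsort().argsort() + 1                    # R(z_i)
    F = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))   # N(0,1) CDF
    return np.max(np.abs(ranks / (n + 1) - F))

gaps = [max_rank_cdf_gap(n) for n in (100, 1_000, 100_000)]
```

Since \(R(z_{i})/n\) is the empirical distribution function evaluated at \(z_{i}\), the gap shrinks at the Glivenko–Cantelli rate.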
Proof of Theorem 1
In this proof, we take L to be an arbitrary positive constant, not necessarily the same at each occurrence, and set \(b_{ni}(\mathbf {\beta })=R(\nu _{ni}(\mathbf {\beta }))/(n+1)\) and \(a_{ni}(\mathbf {\beta })=R(z_{i}(\mathbf {\beta }))/(n+1)\). By the definitions of \(\widetilde{D}_{n}(\mathbf {\beta })\) and \(D_{n}(\mathbf {\beta })\), we have
Considering the first term on the right-hand side of Eq. (14), we have
where L is the bound of \(\varphi \), which is bounded by assumption \((I_1)\). By the Cauchy–Schwarz inequality, we have
The strong law of large numbers gives \(n^{-1}\sum _{i=1}^{n}|Y_{i}|^{2}\rightarrow E[|Y|^{2}]<\infty \;a.s.\) On the other hand, we have,
as \(n\rightarrow \infty \), by Lemma 3. Similarly,
Again, by Lemma 3,
and
Thus, \(n^{-1}\sum _{i=1}^{n}\left| I_{\varGamma _{n}}({\mathbf X}_{i})-I_{\varGamma } ({\mathbf X}_{i})\right| |\widehat{g}_{\mathbf {\beta },h}^{i}({\mathbf X}_{i}^{\tau }\mathbf {\beta }) -g_{\mathbf {\beta }}({\mathbf X}_{i}^{\tau }\mathbf {\beta })|\rightarrow 0\;a.s.\) Moreover,
Following the same argument as above, we have \(n^{-1}\sum _{i=1}^{n}\left| I_{\varGamma _{n}}({\mathbf X}_{i})-I_{\varGamma }({\mathbf X}_{i})\right| ^{2}\rightarrow 0\;a.s.\), and a direct application of the strong law of large numbers gives
Turning to the second term on the right-hand side of Eq. (14), it can be further decomposed as follows:
Considering the first term on the right-hand side of this equation, we have
which converges to 0 a.s. by Lemma 3. Now, set \(F_{i\nu }(s)=P(\nu _{in}(\mathbf {\beta })\le s)\) and \(F_{i}(s)=P(z_{i}(\mathbf {\beta })\le s)\). Then,
As in the proof of Lemma 2, since for \(i=1,\ldots ,n\) and for all \(\mathbf {\beta }\in \mathscr {B}\), \(F_{i\nu }(\nu _{in}(\mathbf {\beta }))\) and \(F_{i}(z_{i}(\mathbf {\beta }))\) are independent uniformly distributed random variables on (0, 1), following Hájek et al. (1999), we have, \(b_{ni}(\mathbf {\beta })-F_{i\nu }(\nu _{in}(\mathbf {\beta }))\rightarrow 0\;a.s.\) and \(a_{ni}(\mathbf {\beta })-F_{i}(z_{i}(\mathbf {\beta }))\rightarrow 0\;a.s.\), for each i. Applying the generalized continuous mapping theorem (Whitt 2011), we have \(\varphi (b_{ni}(\mathbf {\beta }))-\varphi (F_{i\nu }(\nu _{in}(\mathbf {\beta })))\rightarrow 0\;a.s.\) and \(\varphi (F_{i}(z_{i}(\mathbf {\beta })))-\varphi (a_{ni}(\mathbf {\beta }))\rightarrow 0\;a.s.\), for each i and for all \(\mathbf {\beta }\in \mathscr {B}\). Also, since \(\nu _{in}(\mathbf {\beta })-z_{i}(\mathbf {\beta })\rightarrow 0\;a.s.\), by the continuity of the probability measure and the continuity of \(\varphi \), we have \(\varphi (F_{i\nu }(\nu _{in}(\mathbf {\beta })))-\varphi (F_{i}(z_{i}(\mathbf {\beta })))\rightarrow 0\;a.s.\), for each i and for all \(\mathbf {\beta }\in \mathscr {B}\). Thus,
From this, we have
which converges almost surely to zero. Furthermore,
By the strong law of large numbers, the entire expression on the right-hand side of this inequality converges a.s. to \(E\{|Y|^2\}+E\{J^{2}({\mathbf X})\}+2\left( E\{|Y|^2\}E\{J^{2}({\mathbf X})\}\right) ^{1/2}<\infty \), by assumptions \((I_2)\)–(iii) and \((I_4)\). Thus,
Now, combining all these facts, we have \(\displaystyle \sup _{\mathbf {\beta }\in \mathscr {B},h\in \mathscr {H}_{n}}|\widetilde{D}_{n}(\mathbf {\beta })-D_{n}(\mathbf {\beta })|\ \rightarrow \ 0\;\;a.s.\) \(\square \)
Proof of Theorem 2
Note that \(\varphi \) has a bounded first derivative. So, \(\varphi \in Lip(1)\). Moreover, by \((I_2)\)–(iii) and \((I_4)\), we have \(\text{ Var }(z_i(\mathbf {\beta })) < \infty \), for all i and \(\mathbf {\beta } \in \mathscr {B}\). Then
where \(\sigma ^2_\mathrm{{max}}(\mathbf {\beta }) = \max \{\text{ Var }(z_1(\mathbf {\beta })), \ldots , \text{ Var }(z_n(\mathbf {\beta }))\}\). Setting \(\alpha _n = 1/n\) and \(\beta =1\) in the theorem of Xiang (1995), we find that for every \(\mathbf {\beta } \in \mathscr {B}\), \(D_n(\mathbf {\beta }) - E\{D_n(\mathbf {\beta })\} \rightarrow 0 \ a.s.\)
To complete the proof, we have to show that \(\{D_{n}(\mathbf {\beta })\}_{n\ge 1}\) is stochastically equicontinuous. To that end, taking \(\mathbf {\beta }_{1},\mathbf {\beta }_{2}\in \mathscr {B}\), we have
As in the proof of Theorem 1, set \(a_{ni}(\mathbf {\beta })=R(z_{i}(\mathbf {\beta }))/(n+1)\). Then,
Note that \(z_{i}(\mathbf {\beta }_1)-z_{i}(\mathbf {\beta }_2)=g_{\mathbf {\beta }_{1}} ({\mathbf X}_{i}^{\tau }\mathbf {\beta }_{1})-g_{\mathbf {\beta }_{2}}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_{2})\). Since \(g_{\mathbf {\beta }}(\cdot )\) is differentiable with respect to \(\mathbf {\beta }\), applying the mean value theorem on the function \(g_{\mathbf {\beta }}({\mathbf X}^{\tau }\mathbf {\beta })\), there exists \(\mathbf {\xi }=\lambda \mathbf {\beta }_{1}+(1-\lambda )\mathbf {\beta }_{2}\) for some \(\lambda \in (0,1)\) such that
Then, by assumption \((I_2)\)–(iii) we have
Furthermore, set \(h_{i}(\mathbf {\beta })=\varphi \{F_{i}(z_{i}(\mathbf {\beta }))\}=\varphi \{F_{i}(Y_{i} -g_{\mathbf {\beta }}({\mathbf X}_{i}^{\tau }\mathbf {\beta }))\}\), where \(F_{i}\) is the cumulative distribution function of \(z_{i}(\mathbf {\beta })\); \(h_{i}\) is therefore almost surely differentiable. By the mean value theorem, there exists \(\eta =\lambda \mathbf {\beta }_{1}+(1-\lambda )\mathbf {\beta }_{2}\), for some \(\lambda \in (0,1)\), such that \(h_{i}(\mathbf {\beta }_{1})-h_{i}(\mathbf {\beta }_{2})=h'_{i}(\eta )(\mathbf {\beta }_{1} -\mathbf {\beta }_{2})\), with \(h'_{i}(\eta )=-\nabla _{\eta }[g_{\eta }({\mathbf X}_{i}^{\tau }\eta )]f_{i}(z_{i}(\eta )) \varphi ^{\prime }\{F_{i}(z_{i}(\eta ))\}\) and \(f_{i}(t)=dF_{i}(t)/dt\). It is worth pointing out that \(f_{i}\), being a density, is almost surely bounded. Thus, by assumption \((I_2)\)–(iii) again, together with the boundedness of \(\varphi ^{\prime }\), we have \(\Vert h'_{i}(\eta )\Vert \le MJ({\mathbf X}_{i})\; a.s.\), where M is such that \(|f_{i}(z_{i}(\eta ))\varphi ^{\prime }\{F_{i}(z_{i}(\eta ))\}|\le M\;a.s.\) On the other hand, since for \(i=1,\ldots ,n\) the \(F_{i}(z_{i}(\mathbf {\beta }))\) are independent random variables uniformly distributed on (0, 1) for all \(\mathbf {\beta }\in \mathscr {B}\), following Hájek et al. (1999) again as in Theorem 1, we obtain \(a_{ni}(\mathbf {\beta })-F_{i}(z_{i}(\mathbf {\beta }))\rightarrow 0\;a.s.\), for all \(\mathbf {\beta }\in \mathscr {B}\) and for each i. By continuity of \(\varphi \), we have \(\varphi \left( a_{ni}(\mathbf {\beta })\right) -\varphi \{F_{i}(z_{i}(\mathbf {\beta }))\}\rightarrow 0\;a.s.\), for all \(\mathbf {\beta }\in \mathscr {B}\) and for each i. Thus,
for all \(\mathbf {\beta }\in \mathscr {B}\). Now
where L is such that \(|\varphi (t)|\le L\), for all \(t\in (0,1)\). Also, with probability 1, we have
Moreover,
as \(\max _{1\le i\le n}|\varphi \left( a_{ni}(\mathbf {\beta }_1)\right) -\varphi \{F_{i}(z_{i}(\mathbf {\beta }_1))\}|^{2}\rightarrow 0\;\;a.s.\) and
where \(J_{4n}\), defined in Eq. (15), converges almost surely to a finite quantity by the strong law of large numbers under assumptions \((I_2)\)–(iii) and \((I_4)\). Similarly,
converges almost surely to zero. Hence, with probability 1, we have
where
For n large enough, \(B_{n}\) does not depend on \(\mathbf {\beta }\). Since every term in the definition of \(B_{n}\) converges almost surely to a finite quantity, so does \(B_{n}\). Therefore, \(\{D_{n}(\mathbf {\beta })\}_{n\ge 1}\) is stochastically equicontinuous (Rao et al. 2014). \(\square \)
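For intuition about the objective whose equicontinuity was just established, one can compute a Jaeckel-type rank dispersion of the residuals \(z_{i}(\mathbf {\beta })=Y_{i}-g_{\mathbf {\beta }}({\mathbf X}_{i}^{\tau }\mathbf {\beta })\) with the Wilcoxon score \(\varphi (u)=\sqrt{12}(u-1/2)\). In the sketch below the link is treated as known and equal to \(\sin \), the errors are \(t_{3}\), and the off-index direction is arbitrary: all hypothetical choices for illustration only.

```python
import numpy as np

SQ12 = np.sqrt(12.0)
phi = lambda u: SQ12 * (u - 0.5)  # Wilcoxon score: integral 0, square-integral 1

def dispersion(beta, x, y, g=np.sin):
    """D_n(beta) = n^{-1} sum_i phi(R(z_i)/(n+1)) z_i(beta), a rank-based
    measure of residual spread (link g assumed known here)."""
    z = y - g(x @ beta)
    ranks = z.argsort().argsort() + 1
    return np.mean(phi(ranks / (len(y) + 1)) * z)

rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=(n, 2))
b0 = np.array([1.0, 0.5])
y = np.sin(x @ b0) + rng.standard_t(df=3, size=n)     # heavy-tailed errors

d_true = dispersion(b0, x, y)                   # at the true index direction
d_off = dispersion(np.array([0.2, 1.5]), x, y)  # at a wrong direction

# smoothness in beta, echoing the stochastic equicontinuity argument
b1 = b0 + 0.01 * rng.normal(size=2)
gap = abs(dispersion(b1, x, y) - dispersion(b0, x, y))
```

With a nondecreasing score the dispersion is nonnegative and grows with the spread of the residuals, so misspecifying the index direction inflates it.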
Proof of Theorem 4
Note that by Jensen's inequality,
Thus, together with Theorem 1, applying the dominated convergence theorem to the right-hand side of this inequality, we obtain the result. On the other hand,
Thus,
From Theorems 1, 2 and Eq. (16), the terms on the right-hand side of Eq. (17) converge to zero with probability 1. \(\square \)
Proof of Theorem 5
By assumption \((I_6)\), \(\mathbf {\beta }_{0,n}=\mathop {{{\mathrm{Argmin}}}}\limits _{\mathbf {\beta }}E(D_{n}(\mathbf {\beta }))\) which implies that
for all \(\mathbf {\beta }\in \mathscr {B}\). On the other hand, by Theorem 4, we have
Thus, \(\forall ~\varepsilon >0\), there exists \(N>0\) such that for all \(n\ge N\), \(|E(\widetilde{D}_{n}(\mathbf {\beta }))-E(D_{n}(\mathbf {\beta }))|<\varepsilon /2\) for all \(\mathbf {\beta }\in \mathscr {B}\). This implies that
Also, for all \(n\ge N\), \(|E(D_{n}(\mathbf {\beta }_{0,n}))-E(\widetilde{D}_{n}(\mathbf {\beta }_{0,n}))|<\varepsilon /2\). Thus, we have
Substituting Eq. (19) into Eq. (18) gives \(-\varepsilon +E(\widetilde{D}_{n}(\mathbf {\beta }_{0,n}))<E(\widetilde{D}_{n}(\mathbf {\beta }))\), for all \(\mathbf {\beta }\in \mathscr {B}\) and for all \(n\ge N\). Now, \(\varepsilon \) being arbitrary, letting \(\varepsilon \rightarrow 0\), we have \(E(\widetilde{D}_{n}(\mathbf {\beta }_{0,n}))\le E(\widetilde{D}_{n}(\mathbf {\beta }))\), for all \(\mathbf {\beta }\in \mathscr {B}\), which completes the proof. \(\square \)
Proof of Theorem 6
Note that
So,
By continuity of \(\varphi \) and the fact that for \(i=1,\ldots ,n\), \(F_{i}(z_{i}(\mathbf {\beta }))\) are independent uniformly distributed in (0, 1), once again following Hájek et al. (1999), we have \(\left| \varphi \left( \frac{R(z_{i}(\mathbf {\beta }))}{n+1}\right) -\varphi \left( F_{i}(z_{i}(\mathbf {\beta }))\right) \right| \rightarrow 0\;a.s.\), for all i and \(\mathbf {\beta }\in \mathscr {B}\). Thus,
On the other hand, \(n^{-1}\sum _{i=1}^{n}J^{2}({\mathbf X}_{i})\rightarrow E[J^{2}({\mathbf X})]<\infty \;a.s.\) Hence, \(\displaystyle \lim _{n\rightarrow \infty }\sup _{\mathbf {\beta }\in \mathscr {B}}|S_{n} (\mathbf {\beta })-T_{n}(\mathbf {\beta })|=0\;a.s.\) \(\square \)
Proof of Theorem 7
Note that
A direct application of the strong law of large numbers shows that \(\nabla _{\mathbf {\beta }_0}T_{n}(\mathbf {\beta }_0)\rightarrow {\mathbf W}\;a.s.\) If we assume that \({\mathbf X}\) is independent of \(\varepsilon \), we have
But
from integration by parts, since \(f(\varepsilon )\varphi (F(\varepsilon ))\rightarrow 0\) as \(\varepsilon \rightarrow \pm \infty \). Now, putting \(u=F(\varepsilon )\), we have
On the other hand, by assumption \((I_1)\), \(E\left[ \varphi \big (F(\varepsilon )\big )\right] =\int _{0}^{1}\varphi (t)dt=0\). Thus,
On the other hand, to simplify notation, set \({\mathbf A}_{i}=\nabla _{\mathbf {\xi }}[g_{\mathbf {\xi }}({\mathbf X}_{i}^{\tau }\mathbf {\xi })]\), \({\mathbf B}_{i}=\nabla _{\mathbf {\xi }}^{2}[g_{\mathbf {\xi }}({\mathbf X}_{i}^{\tau }\mathbf {\xi })]\) and \({\mathbf C}_{i}=\nabla _{\mathbf {\xi }}^{3}[g_{\mathbf {\xi }}({\mathbf X}_{i}^{\tau }\mathbf {\xi })]\).
From this, it can easily be shown that each term on the right-hand side of this equation is bounded by
which converges almost surely to \(3L\times E[\exp \{\lambda \Vert {\mathbf X}\Vert \}\{J({\mathbf X})+J^{2}({\mathbf X})+J^{3}({\mathbf X})\}]<\infty \), by the strong law of large numbers under \((I_2)\)–(iii) and \((I_4)\). Thus, \(\nabla _{\mathbf {\beta }}^{2}T_{n}(\mathbf {\xi })\) is almost surely bounded and the result follows from Theorem 2. \(\square \)
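The score identities used in the last two proofs — \(\int _{0}^{1}\varphi (t)\,dt=0\) from \((I_{1})\), \(\int _{0}^{1}\varphi ^{2}(t)\,dt=1\), and the change of variable \(u=F(\varepsilon )\) that turns \(E[f(\varepsilon )\varphi '(F(\varepsilon ))]\) into an integral over (0, 1) — can be verified numerically. The check below uses the Wilcoxon score and standard normal errors (hypothetical choices), for which the change of variable gives \(E[f(\varepsilon )\varphi '(F(\varepsilon ))]=\sqrt{12}\int f^{2}=\sqrt{12}/(2\sqrt{\pi })\).

```python
import numpy as np
from math import erf, sqrt, pi

SQ12 = sqrt(12.0)
phi = lambda u: SQ12 * (u - 0.5)   # Wilcoxon score
dphi = SQ12                        # phi'(u) is constant

# quadrature grid for eps ~ N(0, 1)
eps = np.linspace(-8.0, 8.0, 200_001)
f = np.exp(-eps ** 2 / 2.0) / sqrt(2.0 * pi)            # density
F = 0.5 * (1.0 + np.vectorize(erf)(eps / sqrt(2.0)))    # CDF
w = f * (eps[1] - eps[0])                               # f(eps) d(eps)

m1 = np.sum(phi(F) * w)        # E[phi(F(eps))]   = int_0^1 phi(u) du  -> 0
m2 = np.sum(phi(F) ** 2 * w)   # E[phi^2(F(eps))] = int_0^1 phi^2(u) du -> 1
ibp = np.sum(dphi * f * w)     # E[f(eps) phi'(F(eps))] = sqrt(12) int f^2
```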
Proof of Theorem 8
We mimic the proof given in Hettmansperger and McKean (1998) for the linear model. Set
It follows by a routine argument that \(\sqrt{n}(S_{n}(\mathbf {\beta }_{0})-T_{n}(\mathbf {\beta }_{0}))\) converges to \(\mathbf {0}\) in probability. Hence, the proof will be completed by showing that \(\sqrt{n}T_{n}(\mathbf {\beta }_{0})\) converges to the intended distribution. Using the Cramér–Wold device (Serfling 1980), let
where \({\mathbf a}\in \mathbb {R}^p\). Since F is the distribution of \(\varepsilon (\mathbf {\beta }_{0})\) and \(\int _{0}^{1}\varphi (t)\mathrm{d}t=0\), we have \(E(U)=0\). Also, since \(\int _{0}^{1}\varphi ^{2}(t)\mathrm{d}t=1\),
Note that U is the sum of independent functions of random variables which are not necessarily identically distributed; hence, we need to establish the limit distribution by the Lindeberg–Feller central limit theorem. To this end, set \(\sigma _{n}^{2}=\mathrm{Var}(U)\). Defining \(A_{n}\) by
we need to show that
By assumption \((I_3)\)–(iii), \(\Vert \nabla _{\mathbf {\beta }_0}\big (g_{\mathbf {\beta }_0}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)\big )\Vert \le J({\mathbf X}_{i})\) and so,
\(J(\cdot )\), being integrable, is almost surely bounded. Thus, there exists a positive constant c such that \(J({\mathbf X}_{i})\le c\;a.s.\), and therefore, \(n^{-1/2}|{\mathbf a}^{\tau } \nabla _{\mathbf {\beta }_0}\big (g_{\mathbf {\beta }_0}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)\big )|\le n^{-1/2}c\Vert {\mathbf a}\Vert \;a.s.\) Hence,
Set \(\lambda _{n}=n^{-1/2}c\Vert {\mathbf a}\Vert \). Then, \(\lambda _{n}\rightarrow 0\) as \(n\rightarrow \infty \), and is independent of i. Since \(\sigma ^{2}_{n}\) converges to a positive quantity, the ratio \(\sigma _{n}/\lambda _{n}\rightarrow \infty \) as \(n\rightarrow \infty \). Now conditioning on \({\mathbf X}_{i}\), it is easy to see that
In this expression, \(\lim _{n \rightarrow \infty } n^{-1}\sum _{i=1}^{n}E\{I_{\varGamma }({\mathbf X})\nabla _{\mathbf {\beta }_0}\big (g_{\mathbf {\beta }_0}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)\big )[\nabla _{\mathbf {\beta }_0}\big (g_{\mathbf {\beta }_0}({\mathbf X}_{i}^{\tau }\mathbf {\beta }_0)\big )]^{\tau }\}< \infty \) by \((I_2)\)–(iii), \((I_4)\) and \((I_6)\). From the boundedness of \(\varphi \) and applying the dominated convergence theorem, we have
This shows that the limit in (20) goes to zero as \(n\rightarrow \infty \). \(\square \)
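The Lindeberg–Feller argument can be probed by simulation: since \(F(\varepsilon _{i})\) is uniform on (0, 1) and \(\int _{0}^{1}\varphi ^{2}=1\), the weighted sums \(U\) should be centered and approximately Gaussian already at moderate n. The sketch below uses the Wilcoxon score and a hypothetical index model with link \(\sin \) (neither is prescribed by the theorem).

```python
import numpy as np

rng = np.random.default_rng(8)
SQ12 = np.sqrt(12.0)
b0 = np.array([1.0, 0.5])
a = np.array([1.0, -1.0])   # Cramer-Wold direction

def simulate_U(n):
    """One draw of U = n^{-1/2} sum_i a' grad_beta g(X_i' b0) phi(F(eps_i));
    for g(t) = sin(t), grad_beta g(X' b0) = cos(X' b0) X."""
    x = rng.normal(size=(n, 2))
    grad = np.cos(x @ b0)[:, None] * x
    u = rng.uniform(size=n)               # F(eps_i) ~ Uniform(0, 1)
    return np.sum((grad @ a) * SQ12 * (u - 0.5)) / np.sqrt(n)

draws = np.array([simulate_U(200) for _ in range(4000)])
# empirical coverage of the +/- 1.96 standardized interval, about 0.95
cover = (np.abs(draws - draws.mean()) / draws.std() < 1.96).mean()
```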
Proof of Theorem 10
Recall that from Eq. (2), for any \({\mathbf X}_{i}\in \varGamma \),
where \(z_{(1)}(\mathbf {\beta })\le z_{(2)}(\mathbf {\beta })\le \cdots \le z_{(n)}(\mathbf {\beta })\). Since R(t) is a step function, it has finitely many jumps; the set of jump points therefore has probability zero. Since \(g_{\mathbf {\beta }}(\cdot )\) is assumed to be three times continuously differentiable by \((I_2)\)–(iii), \(D_{n}(\mathbf {\beta })\) is almost surely differentiable. From this, taking into account Theorem 6 and expanding \(D_{n}(\mathbf {\beta })\) around \(\mathbf {\beta }_{0}\) up to order 2, we have with probability 1,
where \(\mathbf {\xi }=\lambda \mathbf {\beta }_{0}+(1-\lambda )\mathbf {\beta }\), for \(\lambda \in (0,1)\). Thus,
From this, we have
as \(\Vert \nabla _{\mathbf {\beta }_0}T_n(\mathbf {\beta }_0)\Vert \) and \(\Vert \nabla _{\mathbf {\beta }}T_{n}(\mathbf {\xi })\Vert \) are bounded by \(L n^{-1}\sum _{i=1}^{n}[J({\mathbf X}_{i})+J^{2}({\mathbf X}_{i})]\). On the other hand, \(n^{-1}\sum _{i=1}^{n}[J({\mathbf X}_{i})+J^{2}({\mathbf X}_{i})]\rightarrow E\big [J({\mathbf X})+J^{2}({\mathbf X})\big ]<\infty \;a.s.\), by assumptions \((I_2)\)–(iii) and \((I_4)\). Now, for any \(\mathbf {\beta }\in \mathscr {B}_{n}\), \(\Vert \mathbf {\beta }-\mathbf {\beta }_{0}\Vert \le c/\sqrt{n}\). This implies that
By Markov’s inequality, we have for any \(\varepsilon >0\) and for n large enough,
A direct application of the dominated convergence theorem gives
Thus, \(\displaystyle \lim _{n\rightarrow \infty } P_{\mathbf {\beta }_{0}}\Big [\sup _{\mathbf {\beta }\in \mathscr {B}_{n}}|D_{n}(\mathbf {\beta })-M_{n}(\mathbf {\beta })|>\varepsilon \Big ]=0\). The proof of Eq. (8) is obtained similarly, while that of Eq. (7) is obtained by combining Eq. (6) and Theorem 1. \(\square \)
Proof of Theorem 11
Equation (12) gives \(\sqrt{n}\big (\tilde{\mathbf {\beta }}_{n}-\mathbf {\beta }_{0}\big ) = -\widetilde{{\mathbf W}}_{n}^{-1}\sqrt{n}\widetilde{S}_{n}(\mathbf {\beta }_{0}) + o_p(1)\) and by (9) we have \(\sqrt{n}\widetilde{S}_{n}(\mathbf {\beta }_{0}) = \sqrt{n}S_{n}(\mathbf {\beta }_{0}) + o_p(1)\). Moreover, \(\widetilde{{\mathbf W}}_{n} = {\mathbf W}+ o_p(1)\) by (13). Since \({\mathbf W}\) is positive definite, we have \(\sqrt{n}\big (\tilde{\mathbf {\beta }}_{n}-\mathbf {\beta }_{0}\big ) = -{\mathbf W}^{-1}\sqrt{n} S_{n}(\mathbf {\beta }_{0}) + o_p(1)\). The result follows by Theorem 8. \(\square \)
Cite this article
Bindele, H.F., Abebe, A. & Meyer, K.N. General rank-based estimation for regression single index models. Ann Inst Stat Math 70, 1115–1146 (2018). https://doi.org/10.1007/s10463-017-0618-9