Robust estimators in a generalized partly linear regression model under monotony constraints

Abstract

In this paper, we consider the situation in which the observations follow an isotonic generalized partly linear model. Under this model, the mean of the responses is modelled, through a link function, linearly on some covariates and nonparametrically on a univariate regressor, in such a way that the nonparametric component is assumed to be a monotone function. A class of robust estimators for the monotone nonparametric component and for the regression parameter, related to the linear one, is defined. The robust estimators are based on a spline approach combined with a score function which bounds large values of the deviance. As an application, we consider the isotonic partly linear log-Gamma regression model. Under regularity conditions, we derive consistency results for the nonparametric function estimators as well as consistency and asymptotic distribution results for the regression parameter estimators. Besides, the empirical influence function allows us to study the sensitivity of the estimators to anomalous observations. Through a Monte Carlo study, we investigate the performance of the proposed estimators under a partly linear log-Gamma regression model with an increasing nonparametric component. The proposal is illustrated on a real data set.
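To fix ideas, the estimation strategy described above can be sketched numerically. The code below is a purely illustrative toy version, not the estimator studied in the paper: it uses the identity link, a Huber loss in place of the deviance-based score, and a piecewise-linear hinge basis with sign-constrained coefficients in place of monotone B-splines; all function names, tuning constants, and simulation settings are our own choices.

```python
import numpy as np
from scipy.optimize import minimize

def huber(r, c=1.345):
    """Huber loss: quadratic near zero, linear in the tails, so gross
    outliers in the response contribute a bounded score."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * a - 0.5 * c**2)

def fit_monotone_plm(y, x, t, n_knots=6, c=1.345):
    """Robustly fit y ~ x'beta + g(t) with g nondecreasing on [0, 1].

    g(t) = a0 + sum_j c_j (t - k_j)_+ is a piecewise-linear spline;
    restricting every slope coefficient c_j >= 0 forces g to be
    nondecreasing by construction."""
    knots = np.linspace(0.0, 1.0, n_knots + 2)[:-1]      # hinge knots, 0 included
    Z = np.maximum(t[:, None] - knots[None, :], 0.0)     # n x (n_knots + 1) basis
    p = x.shape[1]

    def loss(par):
        beta, a0, coef = par[:p], par[p], par[p + 1:]
        resid = y - x @ beta - a0 - Z @ coef
        return huber(resid, c).mean()

    # Sign constraints on the spline slopes impose monotonicity.
    bounds = [(None, None)] * (p + 1) + [(0.0, None)] * Z.shape[1]
    res = minimize(loss, np.zeros(p + 1 + Z.shape[1]),
                   method="L-BFGS-B", bounds=bounds)
    beta_hat, a0, coef = res.x[:p], res.x[p], res.x[p + 1:]
    g_hat = lambda s: a0 + np.maximum(s[:, None] - knots[None, :], 0.0) @ coef
    return beta_hat, g_hat

# Simulated example: increasing nonparametric component, 5% gross outliers.
rng = np.random.default_rng(0)
n, beta0 = 300, np.array([1.0, -0.5])
x = rng.normal(size=(n, 2))
t = rng.uniform(0.0, 1.0, size=n)
y = x @ beta0 + 2.0 * t**2 + 0.3 * rng.normal(size=n)
y[:15] += 8.0                                            # contaminated responses
beta_hat, g_hat = fit_monotone_plm(y, x, t)
grid = np.linspace(0.0, 1.0, 50)
```

Because the loss is bounded in its influence and the monotonicity constraint is imposed through the sign restriction on the hinge coefficients, the fitted nonparametric component is nondecreasing by construction and the contaminated responses do not drag the estimate of the regression parameter away from its target.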



References

  1. Aït Sahalia Y (1995) The delta method for nonparametric kernel functionals. Ph.D. dissertation, University of Chicago

  2. Álvarez E, Yohai V (2012) \(M\)-estimators for isotonic regression. J Stat Plan Inference 142:2241–2284


  3. Bianco A, Boente G (2004) Robust estimators in semiparametric partly linear regression models. J Stat Plan Inference 122:229–252


  4. Bianco A, Yohai V (1996) Robust estimation in the logistic regression model. Lecture notes in statistics, vol 109. Springer, New York, pp 17–34


  5. Bianco A, García Ben M, Yohai V (2005) Robust estimation for linear regression with asymmetric errors. Can J Stat 33:511–528


  6. Bianco A, Boente G, Rodrigues I (2013a) Resistant estimators in Poisson and Gamma models with missing responses and an application to outlier detection. J Multivar Anal 114:209–226


  7. Bianco A, Boente G, Rodrigues I (2013b) Robust tests in generalized linear models with missing responses. Comput Stat Data Anal 65:80–97


  8. Birke M, Dette H (2007) Testing strict monotonicity in nonparametric regression. Math Methods Stat 16:110–123


  9. Boente G, Rodríguez D (2010) Robust inference in generalized partially linear models. Comput Stat Data Anal 54:2942–2966


  10. Boente G, He X, Zhou J (2006) Robust estimates in generalized partially linear models. Ann Stat 34:2856–2878


  11. Boente G, Rodríguez D, Vena P (2018) Robust estimators in a generalized partly linear regression model under monotony constraints. https://arxiv.org/abs/1802.07998

  12. Cantoni E, Ronchetti E (2001) Robust inference for generalized linear models. J Am Stat Assoc 96:1022–1030


  13. Cantoni E, Ronchetti E (2006) A robust approach for skewed and heavy tailed outcomes in the analysis of health care expenditures. J Health Econ 25:198–213


  14. Croux C, Haesbroeck G (2002) Implementing the Bianco and Yohai estimator for logistic regression. Comput Stat Data Anal 44:273–295


  15. Du J, Sun Z, Xie T (2013) \(M\)-estimation for the partially linear regression model under monotonic constraints. Stat Probab Lett 83:1353–1363


  16. Gijbels I, Hall P, Jones M, Koch I (2000) Tests for monotonicity of a regression mean with guaranteed level. Biometrika 87:663–673


  17. Härdle W, Liang H, Gao J (2000) Partially linear models. Physica-Verlag, Heidelberg


  18. He X, Shi P (1996) Bivariate tensor-product \(B\)-spline in a partly linear model. J Multivar Anal 58:162–181


  19. He X, Shi P (1998) Monotone B-spline smoothing. J Am Stat Assoc 93:643–650


  20. He X, Zhu Z, Fung W (2002) Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89:579–590


  21. Heritier S, Cantoni E, Copt S, Victoria-Feser MP (2009) Robust methods in biostatistics. Wiley series in probability and statistics. Wiley, New York


  22. Huang J (2002) A note on estimating a partly linear model under monotonicity constraints. J Stat Plan Inference 107:343–351


  23. Künsch H, Stefanski L, Carroll R (1989) Conditionally unbiased bounded influence estimation in general regression models with applications to generalized linear models. J Am Stat Assoc 84:460–466


  24. Lu M (2010) Spline-based sieve maximum likelihood estimation in the partly linear model under monotonicity constraints. J Multivar Anal 101:2528–2542


  25. Lu M (2015) Spline estimation of generalised monotonic regression. J Nonparametr Stat 27:19–39


  26. Lu M, Zhang Y, Huang J (2007) Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94:705–718


  27. Mallows C (1974) On some topics in robustness. Memorandum Bell Laboratories, Murray Hill


  28. Manchester L (1996) Empirical influence for robust smoothing. Aust J Stat 38:275–296


  29. Marazzi A, Yohai V (2004) Adaptively truncated maximum likelihood regression with asymmetric errors. J Stat Plan Inference 122:271–291


  30. Maronna R, Martin D, Yohai V (2006) Robust statistics: theory and methods. Wiley, New York


  31. McCullagh P, Nelder J (1989) Generalized linear models, 2nd edn. Chapman and Hall, London


  32. Ramsay J (1988) Monotone regression splines in action. Stat Sci 3:425–441


  33. Schumaker L (1981) Spline functions: basic theory. Wiley, New York


  34. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464


  35. Shen X, Wong WH (1994) Convergence rate of sieve estimates. Ann Stat 22:580–615


  36. Stefanski L, Carroll R, Ruppert D (1986) Bounded score functions for generalized linear models. Biometrika 73:413–424


  37. Stone CJ (1986) The dimensionality reduction principle for generalized additive models. Ann Stat 14:590–606


  38. Sun Z, Zhang Z, Du J (2012) Semiparametric analysis of isotonic errors-in-variables regression models with missing response. Commun Stat Theory Methods 41:2034–2060


  39. Tamine J (2002) Smoothed influence function: another view at robust nonparametric regression. Discussion Paper 62, Sonderforschungsbereich 373, Humboldt-Universität zu Berlin

  40. van de Geer S (2000) Empirical processes in \(M\)-estimation. Cambridge University Press, Cambridge


  41. van der Vaart A (1998) Asymptotic statistics. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge


  42. van der Vaart A, Wellner J (1996) Weak convergence and empirical processes. With applications to statistics. Springer, New York


  43. Wang Y, Huang J (2002) Limiting distribution for monotone median regression. J Stat Plan Inference 108:281–287



Acknowledgements

The authors wish to thank the Associate Editor and two anonymous referees for their valuable comments, which led to an improved version of the original paper. This research was partially supported by Grants PIP 112-201101-00742 from CONICET, PICT 2014-0351 from ANPCYT and 20020170100022BA and 20020170100330BA from the Universidad de Buenos Aires, Argentina, and also by the Spanish Project MTM2016-76969P from the Ministry of Science and Innovation, Spain.

Author information


Corresponding author

Correspondence to Graciela Boente.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material


Supplementary material.

The supplementary material (available online) contains the proof of Theorem 3 and that of the expressions given in (21) and (22) for the empirical influence function of the proposed estimators. Some additional figures for the empirical influence function given in Section 6.1 are provided. It also contains some lemmas ensuring that the entropy assumptions C4 and C5 hold for some choices of the loss function. (PDF 1.21 MB)

Appendix

Appendix

Throughout this section, we will denote as \(\Vert \rho \Vert _{\infty }=\sup _{y \in \mathbb {R}, u\in \mathbb {R}, a\in {\mathcal {V}}} \rho (y,u,a)\) and \(\Vert w\Vert _{\infty }=\sup _{{\mathbf {x}}\in \mathbb {R}^p} w({\mathbf {x}})\).

Proof of Theorem 1

Let \(V_{\varvec{\beta },g,a}=\rho \left( y,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }+g(t),a\right) w({\mathbf {x}}) \) and denote as P the probability measure of \((y ,{\mathbf {x}},t )\) and as \(P_n\) its corresponding empirical measure. Then, \(L_n(\varvec{\beta },g,a)=P_n V_{\varvec{\beta },g,a}\) and \(L(\varvec{\beta },g,a)=P V_{\varvec{\beta },g,a}\).

The consistency of \(\widehat{\kappa }\) entails that, given any neighbourhood \({\mathcal {V}}\) of \(\kappa _0\), there exists a null set \({\mathcal {N}}_{\mathcal {V}}\) such that, for \(\omega \notin {\mathcal {N}}_{\mathcal {V}}\), there exists \(n_0\in \mathbb {N}\) with \(\widehat{\kappa }\in {\mathcal {V}}\) for all \(n\ge n_0\).

The proof follows steps similar to those used in the proof of Theorem 5.7 of van der Vaart (1998). We begin by showing that

$$\begin{aligned} A_n=\sup _{\varvec{\beta }\in \mathbb {R}^p, g\in {\mathcal {M}}_n({\mathcal {T}}_n,\ell ), a \in {\mathcal {V}}} |L_n(\varvec{\beta },g, a)-L(\varvec{\beta },g, a)| \buildrel {a.s.}\over \longrightarrow 0\,. \end{aligned}$$
(A.1)

Note that \(A_n=\sup _{f\in {\mathcal {F}}_n} (P_n-P)f\), where \({\mathcal {F}}_n\) is defined in (10). Furthermore, C1 entails that \(\sup _{f\in {\mathcal {F}}_n}|f|=\Vert \rho \Vert _\infty \Vert w\Vert _\infty \), while C4 and the fact that \(k_n = O(n^\nu )\) with \(\nu< 1/(2r)<1\) imply that

$$\begin{aligned} \frac{1}{n} \log N(\epsilon , {\mathcal {F}}_n, L_1(P_n))= O_\mathbb {P}(1)\; \frac{k_n+p}{n}\, \log \left( \frac{1}{\epsilon }\right) \buildrel {p}\over \longrightarrow 0\,. \end{aligned}$$

Hence, we get that (A.1) holds (see, for instance, Exercise 3.6 in van de Geer (2000) with \(b_n=\max (1, \Vert \rho \Vert _{\infty } \Vert w\Vert _{\infty })\)).

Since \(L(\varvec{\theta }_0, \kappa _0)=\inf _{\varvec{\beta }\in \mathbb {R}^p, g\in {\mathcal {G}}}L(\varvec{\beta },g, \kappa _0)\), where \(\varvec{\theta }_0=(\varvec{\beta }_0,\eta _0)\), we have that

$$\begin{aligned} 0\le L(\widehat{\varvec{\theta }}, \kappa _0)-L(\varvec{\theta }_0, \kappa _0)= \sum _{j=1}^3 A_{n,j}\,, \end{aligned}$$
(A.2)

with \(A_{n,1}=L(\widehat{\varvec{\theta }}, \widehat{\kappa })-L_n(\widehat{\varvec{\theta }}, \widehat{\kappa })\), \(A_{n,2}=L_n(\widehat{\varvec{\theta }}, \widehat{\kappa })-L(\varvec{\theta }_0, \kappa _0)\) and \(A_{n,3}=L(\widehat{\varvec{\theta }}, \kappa _0)-L(\widehat{\varvec{\theta }}, \widehat{\kappa })\). Noting that \(|A_{n,1}|\le A_n\), we obtain that \(A_{n,1}=o_{\text {a.s.}}(1)\). On the other hand, since \(L(\widehat{\varvec{\theta }}, a)=L^{\star } (\widehat{\varvec{\beta }}, \widehat{\varvec{\lambda }}, a)\), the equicontinuity of \(L^{\star }\) stated in C1 and the consistency of \(\widehat{\kappa }\) entail that \(A_{n,3}=o_{\text {a.s.}}(1)\).

We will now bound \(A_{n,2}\). Using Lemma A1 of Lu et al. (2007), we get that there exists \(g_n\in {\mathcal {M}}_n({\mathcal {T}}_n,\ell )\) with \(\ell \ge r+2\), such that \(\Vert g_n-\eta _0\Vert _{\infty }=O(n^{-r\nu } )\), for \(1/(2r +2)< \nu < 1/(2r)\). Denote \(\varvec{\theta }_{0,n}=(\varvec{\beta }_0, g_n)\) and let \(S_{n,1}=(P_n-P)V_{\varvec{\beta }_0,g_n, \widehat{\kappa }}\) and \(S_{n,2}=L(\varvec{\theta }_{0,n}, \widehat{\kappa })-L(\varvec{\theta }_0, \kappa _0)\). Note that \(S_{n,1}\le A_n\), so that from (A.1), we get that \(S_{n,1} \buildrel {a.s.}\over \longrightarrow 0\). On the other hand, if we write \(S_{n,2}=\sum _{j=1}^2 S_{n,2}^{(j)}\) where \(S_{n,2}^{(1)}=L(\varvec{\theta }_{0,n}, \widehat{\kappa })- L(\varvec{\theta }_{0,n}, \kappa _0)\) and \(S_{n,2}^{(2)}=L(\varvec{\theta }_{0,n}, \kappa _0)-L(\varvec{\theta }_0, \kappa _0)\), the continuity of \(\rho \) together with the fact that \(\Vert g_n-\eta _0\Vert _{\infty }\rightarrow 0\) and the dominated convergence theorem entail that \(S_{n,2}^{(2)}\rightarrow 0\), while the continuity and boundedness of \(\rho \) together with the consistency of \(\widehat{\kappa }\) lead to \(S_{n,2}^{(1)}=o_{\text {a.s.}}(1)\). Hence, \(S_{n,j}=o_{\text {a.s.}}(1)\) for \(j=1,2\).
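The spline approximation step borrowed from Lemma A1 of Lu et al. (2007) can be illustrated numerically. The sketch below is a generic illustration (with an arbitrary smooth increasing function, not tied to the model of the paper): it fits least-squares cubic splines with a growing number of interior knots and records the sup-norm error, which shrinks as the number of knots increases, in line with \(\Vert g_n-\eta _0\Vert _{\infty }=O(n^{-r\nu })\) for \(k_n=O(n^{\nu })\).

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

eta0 = lambda u: u + np.sin(2.0 * np.pi * u) / 8.0   # smooth, strictly increasing on [0, 1]
u = np.linspace(0.0, 1.0, 2000)

errors = []
for k_int in (2, 8, 32):                             # growing number of interior knots
    interior = np.linspace(0.0, 1.0, k_int + 2)[1:-1]
    knots = np.r_[[0.0] * 4, interior, [1.0] * 4]    # cubic: boundary knots repeated 4 times
    spl = make_lsq_spline(u, eta0(u), knots, k=3)
    errors.append(np.max(np.abs(spl(u) - eta0(u))))  # sup-norm approximation error
```

Each refinement of the knot sequence reduces the sup-norm distance between the spline and the target function, which is exactly the mechanism that makes \(S_{n,2}^{(2)}\rightarrow 0\) above.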

Using that \(\widehat{\varvec{\theta }}\) minimizes \(L_n\) over \(\mathbb {R}^p\times {\mathcal {M}}_n({\mathcal {T}}_n,\ell )\) we obtain that

$$\begin{aligned} A_{n,2}= L_n(\widehat{\varvec{\theta }},\widehat{\kappa })-L(\varvec{\theta }_0, \kappa _0) \le L_n(\varvec{\theta }_{0,n}, \widehat{\kappa })-L(\varvec{\theta }_0, \kappa _0)=S_{n,1}+S_{n,2} \,. \end{aligned}$$
(A.3)

Hence, from (A.2) and (A.3) and using that \(A_{n,j}=o_{\text {a.s.}}(1)\), for \(j=1,3\) and \( S_{n,j}= o_{\text {a.s.}}(1)\), for \(j=1,2\), we conclude that

$$\begin{aligned} 0\le L(\widehat{\varvec{\theta }}, \kappa _0)-L(\varvec{\theta }_0, \kappa _0)=\sum _{j=1}^3 A_{n,j}\le o_{\text {a.s.}}(1)\,, \end{aligned}$$

so \( L(\widehat{\varvec{\theta }}, \kappa _0)\rightarrow L(\varvec{\theta }_0, \kappa _0)\). The fact that \(\inf _{ {\varvec{\theta }}\in {\mathcal {A}}_\epsilon }L(\varvec{\theta },\kappa _0)>L(\varvec{\theta }_0,\kappa _0)\) entails that \(\pi (\widehat{\varvec{\theta }},\varvec{\theta }_0) \buildrel {a.s.}\over \longrightarrow 0\), concluding the proof. \(\square \)

Proof of Theorem 2

Define the functions \(M_1(s)=L(\varvec{\beta }_0+ s \varvec{\beta },\eta _0,a)\) and \(M_2(s)=L(\varvec{\beta }_0,\eta _0+sg,a) \) and note that \(M_1^{\prime }(0)= \mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t ) ,a) {\mathbf {x}}^{\textsc {t}}\varvec{\beta }\right] \) and \(M_2^{\prime }(0)= \mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t ) ,a) g(t)\right] \). When C9a) holds, we have that \(M_1(s)\) and \(M_2(s)\) have a minimum at \(s=0\), for any \(\varvec{\beta }\in \mathbb {R}^p\) and \(g\in {\mathcal {G}}\). Then, \(M_1^{\prime }(0)=0\) and \(M_2^{\prime }(0)=0\) which implies that, for any \(a\in {\mathcal {V}}\),

$$\begin{aligned}&\mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t ) ,a) {\mathbf {x}}\right] = {\mathbf{0}} \end{aligned}$$
(A.4)
$$\begin{aligned}&\mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t ) ,a) g(t)\right] = 0 \quad \text{ for } \text{ any } g\in {\mathcal {G}}\,. \end{aligned}$$
(A.5)

Clearly, (A.4) and (A.5) also hold under C9b).

To prove Theorem 2 under both sets of assumptions, we first state the steps common to both and then complete the proof separately under C5\(^\star \) and under C5\(^{\star \star }\).

We denote \( \varTheta _n = \mathbb {R}^p\times {\mathcal {M}}_n({\mathcal {T}}_n,\ell )\cap \{\varvec{\theta }=(\varvec{\beta },g)\in \varTheta : \pi (\varvec{\theta },\varvec{\theta }_0)<\epsilon _0\}\), where \(\varTheta =\mathbb {R}^p\times {\mathcal {G}}\). Note that, except for a null probability set, \(\widehat{\varvec{\theta }}\in \varTheta _n\), for n large enough. As in the proof of Theorem 1, let \(g_n\in {\mathcal {M}}_n({\mathcal {T}}_n,\ell )\) with \(\ell \ge r+2\), \(g_n(t)=\varvec{\lambda }_n^{\textsc {t}}{\mathbf {B}}(t)\), be such that \(\Vert g_n-\eta _0\Vert _{\infty }=O(n^{-r\nu } )\), for \(1/(2r +2)< \nu < 1/(2r)\) and denote \(\varvec{\theta }_{0,n} = (\varvec{\beta }_0,g_n)\).

In order to get the convergence rate of our estimator \(\widehat{\varvec{\theta }}= (\widehat{\varvec{\beta }},\widehat{\eta })\), we will apply Theorem 3.4.1 of van der Vaart and Wellner (1996). For that purpose, following the notation in that Theorem, denote as \(M(\varvec{\theta })= - L(\varvec{\theta }, \widehat{\kappa })\) and \(\mathbb {M}_n(\varvec{\theta })=- L_n(\varvec{\theta }, \widehat{\kappa })\) and for \(\varvec{\theta }\in \varTheta _n\), let \(d_n(\varvec{\theta }, \varvec{\theta }_0)= \pi _{\mathbb {P}}(\varvec{\theta }, \varvec{\theta }_0)\). Note that the function M is random, due to the nuisance parameter estimator \(\widehat{\kappa }\). Let \(\delta _n=A\Vert \eta _0-g_n\Vert _{{{\mathcal {F}}}}\), where \(A=4\,\sqrt{(C_0 /\Vert w\Vert _{\infty }+A_0)/C_0}\) with \(A_0= \Vert w\Vert _{\infty } \Vert \chi \Vert _{\infty }/2\) and \(C_0\) given in C8.

Using that \(|(L_n(\varvec{\theta }, \widehat{\kappa })- L(\varvec{\theta },\widehat{\kappa }))-(L_n(\varvec{\theta }_{0,n}, \widehat{\kappa })- L(\varvec{\theta }_{0,n},\widehat{\kappa })) |= | (\mathbb {M}_n-M)(\varvec{\theta })- (\mathbb {M}_n-M)(\varvec{\theta }_{0,n})|\), to make use of Theorem 3.4.1 of van der Vaart and Wellner (1996), we have to show that there exists a function \(\phi _n\) such that \(\phi _n(\delta )/\delta ^{\alpha } \) is decreasing on \((\delta _n, \infty )\) for some \(\alpha <2\) and that for any \(\delta >\delta _n\),

$$\begin{aligned} \sup _{ {\varvec{\theta }}\in \varTheta _{n,\delta }} L(\varvec{\theta }_{0,n}, \widehat{\kappa })- L(\varvec{\theta },\widehat{\kappa })=\sup _{ {\varvec{\theta }}\in \varTheta _{n,\delta }} M(\varvec{\theta })- M(\varvec{\theta }_{0,n})\lesssim & {} -\delta ^2 \nonumber \\ \end{aligned}$$
(A.6)
$$\begin{aligned} \mathbb {E}^{*} \sup _{ {\varvec{\theta }}\in \varTheta _{n, \delta }} \sqrt{n} \left| (L_n(\varvec{\theta }, \widehat{\kappa })- L(\varvec{\theta },\widehat{\kappa }))-(L_n(\varvec{\theta }_{0,n}, \widehat{\kappa })- L(\varvec{\theta }_{0,n},\widehat{\kappa })) \right|\lesssim & {} \phi _n(\delta ) \nonumber \\ \end{aligned}$$
(A.7)
$$\begin{aligned} d_n(\widehat{\varvec{\theta }}, \varvec{\theta }_{0,n})&\buildrel {p}\over \longrightarrow&0 \end{aligned}$$
(A.8)

where the symbol \(\lesssim \) means less than or equal to up to a multiplicative constant, \(\mathbb {E}^{*}\) stands for the outer expectation and \(\varTheta _{n,\delta }=\{\varvec{\theta }\in \varTheta _n: \delta / 2 < d_n(\varvec{\theta },\varvec{\theta }_{0,n}) \le \delta \}\).

Assumption C8 and the fact that \(\widehat{\kappa } \buildrel {a.s.}\over \longrightarrow \kappa _0\) entail that, except for a null probability set, for any \(\varvec{\theta }\in \varTheta _n\), \(L(\varvec{\theta }, \widehat{\kappa })-L(\varvec{\theta }_0, \widehat{\kappa })\ge C_0\,\pi _{\mathbb {P}}^2(\varvec{\theta },\varvec{\theta }_0)\). Besides, (A.5) entails that \(\mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t ),a) \left( g_n(t )-\eta _0(t )\right) \right] =0\), so

$$\begin{aligned} L(\varvec{\theta }_{0,n}, a)-L(\varvec{\theta }_0, a)= & {} \mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t ),a) \left( g_n(t )-\eta _0(t )\right) \right] \\&+\, \frac{1}{2}\; \mathbb {E}\left[ w({\mathbf {x}})\, \chi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\widetilde{\eta }(t ), a) \left( g_n(t )-\eta _0(t )\right) ^2 \right] \\= & {} \frac{1}{2}\; \mathbb {E}\left[ w({\mathbf {x}})\,\chi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\widetilde{\eta }(t ), a) \left( g_n(t )-\eta _0(t) \right) ^2 \right] \\\le & {} \frac{1}{2}\; \Vert w\Vert _{\infty } \Vert \chi \Vert _{\infty }\mathbb {E}\left( g_n(t)-\eta _0(t) \right) ^2 = A_0\, \Vert g_n-\eta _0\Vert _2^2 \\\le & {} A_0\, \Vert g_n-\eta _0\Vert _{ {\mathcal {F}}}^2=O(n^{-2\,r\nu } )\,, \end{aligned}$$

where \(A_0= \Vert w\Vert _{\infty } \Vert \chi \Vert _{\infty }/2\) and \(\widetilde{\eta }(t)\) is an intermediate value between \(\eta _0(t)\) and \(g_n(t)\). Thus, using that \(d_n^2(\varvec{\theta },\varvec{\theta }_{0,n})\le 2 d_n^2(\varvec{\theta },\varvec{\theta }_0)+ 2 d_n^2(\varvec{\theta }_{0,n},\varvec{\theta }_0) \le 2 d_n^2(\varvec{\theta },\varvec{\theta }_0) + 2 \Vert w\Vert _{\infty }\,\Vert g_n-\eta _0\Vert _{2}^2 \le 2 d_n^2(\varvec{\theta },\varvec{\theta }_0) + 2 \Vert w\Vert _{\infty }\,\Vert g_n-\eta _0\Vert _{{{\mathcal {F}}}}^2\) and that \(\delta / 2 < d_n(\varvec{\theta },\varvec{\theta }_{0,n}) \), we obtain that

$$\begin{aligned} L(\varvec{\theta }, \widehat{\kappa })- L(\varvec{\theta }_{0,n}, \widehat{\kappa })\ge & {} C_0\,d_n^2(\varvec{\theta },\varvec{\theta }_0)- A_0\, \Vert g_n-\eta _0\Vert _{{{\mathcal {F}}}}^2 \ge \frac{C_0}{2} d_n^2(\varvec{\theta },\varvec{\theta }_{0,n}) \\&-\, \left( \frac{C_0}{\Vert w\Vert _{\infty }}+A_0\right) \Vert g_n-\eta _0\Vert _{{{\mathcal {F}}}}^2 \\\ge & {} \frac{C_0}{8} \delta ^2 - \frac{1}{A^2} \left( \frac{C_0}{\Vert w\Vert _{\infty }}+A_0\right) \delta _n^2 =\frac{C_0}{8} \delta ^2- \frac{C_0}{16} \delta _n^2\ge \frac{C_0}{16} \delta ^2\,, \end{aligned}$$

concluding the proof of (A.6).
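For the reader's convenience, the constant bookkeeping behind the last two inequalities can be spelled out: by the definition of \(A\),

$$\begin{aligned} A^2=\frac{16}{C_0}\left( \frac{C_0}{\Vert w\Vert _{\infty }}+A_0\right) \quad \Longrightarrow \quad \frac{1}{A^2} \left( \frac{C_0}{\Vert w\Vert _{\infty }}+A_0\right) =\frac{C_0}{16}\,, \end{aligned}$$

and, since \(\delta >\delta _n\), we have \(\frac{C_0}{8}\, \delta ^2- \frac{C_0}{16}\, \delta _n^2\ge \frac{C_0}{8}\, \delta ^2- \frac{C_0}{16}\, \delta ^2 = \frac{C_0}{16}\, \delta ^2\).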

We now have to find \(\phi _n(\delta )\) such that \(\phi _n(\delta )/\delta \) is decreasing in \(\delta \) and (A.7) holds. Note that, from the consistency of \(\widehat{\kappa }\), with probability one and for \(n\) large enough,

$$\begin{aligned}&\sqrt{n} \left| (L_n(\varvec{\theta }, \widehat{\kappa })- L(\varvec{\theta },\widehat{\kappa }))\right. \left. -\;(L_n(\varvec{\theta }_{0,n}, \widehat{\kappa })- L(\varvec{\theta }_{0,n},\widehat{\kappa })) \right| \\&\quad \le \sup _{ a \in {\mathcal {V}}} \sqrt{n} \left| (L_n(\varvec{\theta }, a)- L(\varvec{\theta },a))-(L_n(\varvec{\theta }_{0,n}, a)- L(\varvec{\theta }_{0,n}, a)) \right| . \end{aligned}$$

Define the class of functions

$$\begin{aligned} {\mathcal {F}}_{n,\delta }= & {} \left\{ V_{{\varvec{\theta }}, a}-V_{{\varvec{\theta }}_{0,n}, a}: \frac{\delta }{2} \le d_n(\varvec{\theta },\varvec{\theta }_{0,n}) \le \delta \,, \varvec{\theta }\in \varTheta _n\,, \, a\in {\mathcal {V}}\right\} \\= & {} \{V_{{\varvec{\theta }}, a}-V_{{\varvec{\theta }}_{0,n}, a}: \varvec{\theta }\in \varTheta _{n,\delta }\,, \, a\in {\mathcal {V}}\}\,, \end{aligned}$$

with \(V_{{\varvec{\theta }}, a}=\rho \left( y,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }+g(t),a\right) w({\mathbf {x}}) \), for \(\varvec{\theta }=(\varvec{\beta },g)\). Inequality (A.7) involves an empirical process indexed by \({\mathcal {F}}_{n,\delta }\), since

$$\begin{aligned}&\mathbb {E}^{*} \sup _{ {\varvec{\theta }}\in \varTheta _{n,\delta }} \sqrt{n} \left| (L_n(\varvec{\theta }, \widehat{\kappa })- L(\varvec{\theta },\widehat{\kappa }))-(L_n(\varvec{\theta }_{0,n}, \widehat{\kappa })- L(\varvec{\theta }_{0,n},\widehat{\kappa })) \right| \\&\quad \le \mathbb {E}^{*} \sup _{f\in {\mathcal {F}}_{n,\delta }} \sqrt{n} |(P_n-P) f|\,. \end{aligned}$$

For any \(f\in {\mathcal {F}}_{n,\delta } \), we have that \(\Vert f\Vert _{\infty } \le A_1 = 2 \Vert \rho \Vert _{\infty } \Vert w\Vert _{\infty }\). Furthermore, if \(A_2= \Vert \psi \Vert _{\infty }^2 \Vert w\Vert _{\infty }\), using that

$$\begin{aligned} |V_{{\varvec{\theta }}, a}-V_{{\varvec{\theta }}_{0,n}, a}| \le \Vert \psi \Vert _{\infty } w({\mathbf {x}}) |{\mathbf {x}}^{\textsc {t}}(\varvec{\beta }-\varvec{\beta }_{0}) + g(t)-g_n(t)|\,, \end{aligned}$$

and the fact that \(\pi _{\mathbb {P}}(\varvec{\theta },\varvec{\theta }_{0,n})=d_n(\varvec{\theta },\varvec{\theta }_{0,n})\le \delta \), we get that

$$\begin{aligned} P f^2\le \Vert \psi \Vert _{\infty }^2\, \mathbb {E}\left( w^2({\mathbf {x}}) \left[ {\mathbf {x}}^{\textsc {t}}(\varvec{\beta }-\varvec{\beta }_{0})+g(t)-g_{n}(t)\right] ^2\right) \le A_2\, \pi _{\mathbb {P}}^2(\varvec{\theta },\varvec{\theta }_{0,n})\le A_2\, \delta ^2\,. \end{aligned}$$

Lemma 3.4.2 of van der Vaart and Wellner (1996) leads to

$$\begin{aligned} \mathbb {E}^{*} \sup _{f\in {\mathcal {F}}_{n,\delta }} \sqrt{n} |(P_n-P) f|\le J_{[\;]}\left( A_2^{1/2}\delta ,{\mathcal {F}}_{n,\delta }, L_2(P)\right) \left( 1+ A_1 \frac{J_{[\;]}(A_2^{1/2}\,\delta ,{\mathcal {F}}_{n,\delta }, L_2(P))}{A_2 \delta ^2 \; \sqrt{n}} \right) \,, \end{aligned}$$

where \(J_{[\;]}(\delta , {\mathcal {F}}, L_2(P)) =\int _0^\delta \sqrt{1+ \log N_{[\;]}(\epsilon , {\mathcal {F}}, L_2(P)) } \mathrm{d}\epsilon \) is the bracketing integral.
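To see concretely why a polynomial bracketing bound yields \(J_{[\;]}(\delta ,{\mathcal {F}}_{n,\delta }, L_2(P))\lesssim \delta \sqrt{q}\) with \(q=k_n+p+1\), one can check numerically that, under \(N_{[\;]}(\epsilon )\le C_2(\delta /\epsilon )^{q}\), the ratio \(J_{[\;]}(\delta )/(\delta \sqrt{q})\) stays bounded as \(q\) grows. The values \(C_2=2\) and \(\delta =0.5\) below are arbitrary illustrative choices, not quantities from the paper.

```python
import numpy as np
from scipy.integrate import quad

def bracketing_integral(delta, q, C2=2.0):
    """J(delta) = int_0^delta sqrt(1 + log N(eps)) d eps with
    N(eps) = C2 * (delta / eps)^q.  The substitution eps = delta * exp(-v)
    turns the integral into a well-behaved one over (0, inf)."""
    f = lambda v: np.sqrt(1.0 + np.log(C2) + q * v) * np.exp(-v)
    val, _ = quad(f, 0.0, np.inf)
    return delta * val

delta = 0.5
ratios = [bracketing_integral(delta, q) / (delta * np.sqrt(q))
          for q in (10, 100, 1000)]
```

The ratios remain of order one as the "dimension" \(q\) grows, which is the content of the bound used in part (a) below.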

(a) Assume now that C5\(^{\star }\) holds and note that for any \(\varvec{\theta }=(\varvec{\beta },g) \in \varTheta _{n,\delta }\), g can be written as \(g=\varvec{\lambda }^{\textsc {t}}{\mathbf {B}}\) for some \(\varvec{\lambda }\in {\mathcal {L}}_{k_n}\), so

$$\begin{aligned} d_n^2(\varvec{\theta }, \varvec{\theta }_{0,n})= & {} \mathbb {E}\left( w({\mathbf {x}}) \left[ {\mathbf {x}}^{\textsc {t}}(\varvec{\beta }-\varvec{\beta }_{0})+(\varvec{\lambda }-\varvec{\lambda }_n)^{\textsc {t}}{\mathbf {B}}(t)\right] ^2\right) \,. \end{aligned}$$

Hence, \({\mathcal {F}}_{n,\delta }\subset {\mathcal {G}}_{n,c, {\varvec{\lambda }}_n}\) with \(c= \delta \) and the bound given in C5\(^{\star }\) leads to

$$\begin{aligned} N_{[\;]}\left( \epsilon ,{\mathcal {F}}_{n,\delta }, L_2(P)\right) \le C_2 \left( \frac{\delta }{\epsilon }\right) ^{k_n+p+1}\,. \end{aligned}$$

This implies that

$$\begin{aligned} J_{[\;]}( A_2^{1/2}\delta ,{\mathcal {F}}_{n,\delta }, L_2(P)) \lesssim \delta \sqrt{k_n+p+1}\,. \end{aligned}$$

If we denote \(q_n = k_n + p+1\), we obtain that for some constant \(A_3\) independent of n and \(\delta \),

$$\begin{aligned} \mathbb {E}^{*} \sup _{{\varvec{\theta }}\in \varTheta _{n,\delta }} |\mathbb {G}_n V_{{\varvec{\theta }}_{0,n}, \kappa _0}-\mathbb {G}_nV_{{\varvec{\theta }}, \kappa _0}| \le A_3\,\left[ \delta \, q_n^{1/2} + \frac{ q_n }{ \sqrt{n}}\right] \,. \end{aligned}$$

Choosing

$$\begin{aligned} \phi _n(\delta )=\delta \, q_n^{1/2} + \frac{ q_n }{ \sqrt{n}} \,, \end{aligned}$$

we have that \(\phi _n(\delta )/\delta \) is decreasing in \(\delta \), concluding the proof of (A.7). The fact that \(\pi (\widehat{\varvec{\theta }}, \varvec{\theta }_0) \buildrel {a.s.}\over \longrightarrow 0\) entails that \(\pi _\mathbb {P}(\widehat{\varvec{\theta }}, \varvec{\theta }_0) \buildrel {a.s.}\over \longrightarrow 0\), which, together with \(\pi _\mathbb {P}(\varvec{\theta }_{0,n}, \varvec{\theta }_0)\rightarrow 0\), leads to (A.8).

Let \(\gamma _n= O(n^{\min (r\nu ,(1-\nu )/2)})\), then \(\gamma _n \lesssim \delta _n^{-1}\), where \(\delta _n=A\Vert \eta _0-g_n\Vert _{{{\mathcal {F}}}}=O(n^{-r\nu })\). We have to show that \(\gamma _n^2\phi _n \left( 1/{\gamma _n}\right) \lesssim \sqrt{n}\). Note that

$$\begin{aligned} \gamma _n^2\phi _n \left( \frac{1}{\gamma _n}\right) =\gamma _n q_n^{1/2}+ \gamma _n^2\, \frac{ q_n }{\sqrt{n}} =\sqrt{n}\; a_n(1+a_n)\,\,, \end{aligned}$$

where \(a_n=\gamma _n q_n^{1/2}/\sqrt{n}\). Hence, to derive that \(\gamma _n^2\phi _n \left( 1/{\gamma _n}\right) \lesssim \sqrt{n}\), it is enough to show that \(a_n=O(1)\), which follows easily since \(k_n=O(n^{\nu })\) and \(\gamma _n= O(n^\varsigma )\) with \(\varsigma =\min (r\nu ,(1-\nu )/2)\).
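In more detail, with \(q_n=k_n+p+1=O(n^{\nu })\) and \(\gamma _n= O(n^\varsigma )\), the order computation for \(a_n\) reads

$$\begin{aligned} a_n=\frac{\gamma _n\, q_n^{1/2}}{\sqrt{n}} = O\left( n^{\varsigma +\nu /2-1/2}\right) = O\left( n^{\varsigma -(1-\nu )/2}\right) =O(1)\,, \end{aligned}$$

since \(\varsigma =\min (r\nu ,(1-\nu )/2)\le (1-\nu )/2\).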

Finally, the condition \(\mathbb {M}_n(\widehat{\varvec{\theta }})\ge \mathbb {M}_n(\varvec{\theta }_{0,n})-O_{\mathbb {P}}(\gamma _n^{-2})\) required by Theorem 3.4.1 of van der Vaart and Wellner (1996) is trivially fulfilled because \(\widehat{\varvec{\theta }}\) minimizes \(L_n(\varvec{\theta }, \widehat{\kappa })\). Hence, we get that \(\gamma _n^2 d_n^2(\varvec{\theta }_{0,n},\widehat{\varvec{\theta }}) = O_{\mathbb {P}}(1)\).

On the other hand, \(d_n(\varvec{\theta }_{0,n},\varvec{\theta }_0)\le \Vert w\Vert _{\infty }^{1/2} \Vert g_n-\eta _0\Vert _{\infty }=O(n^{-r\nu })=O(\gamma _n^{-1})\), which together with \(\gamma _n^2 d_n^2(\varvec{\theta }_{0,n},\widehat{\varvec{\theta }}) = O_{\mathbb {P}}(1)\) and the triangle inequality leads to \(\gamma _n^2 d_n^2(\varvec{\theta }_{0},\widehat{\varvec{\theta }}) = O_{\mathbb {P}}(1)\), concluding the proof.

(b) Assume now that C5\(^{\star \star }\) holds. Then, using that any \(f\in {\mathcal {F}}_{n,\delta }\) can be written as \(f=f_1-f_2\) with \(f_j\in {\mathcal {F}}_{n,\epsilon _0}^\star \) and the bound given in C5\(^{\star \star }\), we get that

$$\begin{aligned} N_{[\;]}\left( \epsilon ,{\mathcal {F}}_{n,\delta }, L_2(P)\right) \le C_2^2 \frac{1}{\epsilon ^{2(k_n+p+1)}}\,. \end{aligned}$$

This implies that

$$\begin{aligned} J_{[\;]}( A_2^{1/2}\delta ,{\mathcal {F}}_{n,\delta }, L_2(P)) \lesssim \delta \log \left( \frac{1}{\delta }\right) \sqrt{k_n+p+1}\,. \end{aligned}$$

If we denote \(q_n = k_n + p+1\), we obtain, for some constant \(A_3\) independent of \(n\) and \(\delta \),

$$\begin{aligned} \mathbb {E}^{*} \sup _{{\varvec{\theta }}\in \varTheta _{n,\delta }} |\mathbb {G}_n V_{{\varvec{\theta }}_{0,n}, \kappa _0}-\mathbb {G}_nV_{{\varvec{\theta }}, \kappa _0}| \le A_3\left( q_n^{1/2} \delta \log \left( \frac{1}{\delta }\right) + n^{-1/2} q_n \left[ \log \left( \frac{1}{\delta }\right) \right] ^2 \right) \,. \end{aligned}$$

Choosing

$$\begin{aligned} \phi _n(\delta )=q_n^{1/2} \delta \log \left( \frac{1}{\delta }\right) + n^{-1/2} q_n \left[ \log \left( \frac{1}{\delta }\right) \right] ^2 \,, \end{aligned}$$

we have that \(\phi _n(\delta )/\delta \) is decreasing in \(\delta \).

Therefore, from Theorem 3.4.1 of van der Vaart and Wellner (1996), we conclude that \(\gamma _n^2 d_n^2(\varvec{\theta }_{0,n},\widehat{\varvec{\theta }}) = O_{\mathbb {P}}(1)\), where \(\gamma _n\) is any sequence satisfying \(\gamma _n \lesssim \delta _n^{-1}\) with \(\delta _n=\pi (\varvec{\theta }_0,\varvec{\theta }_{0,n} )=O(n^{-r\nu })\) and \(\gamma _n^2\phi _n \left( {1}/{\gamma _n}\right) \le \sqrt{n}\). The first condition entails that \(\gamma _n =O(n^{r\nu })\). The second one implies that

$$\begin{aligned} \gamma _n^2 \left( q_n^{1/2}\gamma _n^{-1} \log (\gamma _n)+ q_n n^{-1/2}[\log (\gamma _n)]^2 \right) \le n^{1/2}\,, \end{aligned}$$

so using that \(k_n=O(n^{\nu })\) we get that \(\gamma _n \log (\gamma _n)=O(n^{(1-\nu )/2})\). Finally, the condition \(\mathbb {M}_n(\widehat{\varvec{\theta }})\ge \mathbb {M}_n(\varvec{\theta }_{0,n})-O_{\mathbb {P}}(\gamma _n^{-2})\) required by Theorem 3.4.1 of van der Vaart and Wellner (1996) is trivially fulfilled because \(\widehat{\varvec{\theta }}\) minimizes \(L_n(\varvec{\theta }, \widehat{\kappa })\).

On the other hand, \(d_n(\varvec{\theta }_{0,n},\varvec{\theta }_0)\le \Vert w\Vert _{\infty }^{1/2} \Vert g_n-\eta _0\Vert _{\infty }=O(n^{-r\nu })=O(\gamma _n^{-1})\), which together with \(\gamma _n^2 d_n^2(\varvec{\theta }_{0,n},\widehat{\varvec{\theta }}) = O_{\mathbb {P}}(1)\) and the triangle inequality leads to \(\gamma _n^2 d_n^2(\varvec{\theta }_{0},\widehat{\varvec{\theta }}) = O_{\mathbb {P}}(1)\). \(\square \)

Conditions guaranteeing C8

The following lemma provides conditions to ensure that C8 holds.

Lemma 1

Assume that C9 holds and that \(\rho (y,u,a)\) is twice continuously differentiable with respect to u.

  (a)

    If the function \(\chi \left( y,u, a\right) ={\partial ^2 \rho (y,u,a)}/{\partial u^2} \) is such that, for some \(\epsilon _0>0\) and some neighbourhood \({\mathcal {V}}\) of \(\kappa _0\),

    $$\begin{aligned} C_0= \inf _{a \in {\mathcal {V}}}\inf _{ \begin{array}{c} \pi ^2({\varvec{\theta }},{\varvec{\theta }}_0 )<\epsilon _0\\ {\varvec{\theta }}\in \mathbb {R}^p\times {\mathcal {G}} \end{array}}\inf _{({\mathbf {x}}_0,t_0) \in {\mathcal {S}}_w \times [0,1]} \mathbb {E}\left( \chi \left( y,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }+g(t), a\right) \left| ({\mathbf {x}},t)=({\mathbf {x}}_0,t_0)\right. \right) >0\,, \end{aligned}$$
    (A.9)

    where \({\mathcal {S}}_w\) stands for the support of the function w, then C8 holds.

  (b)

    If \(\pi ^2(\varvec{\theta }_1 ,\varvec{\theta }_2 ) =\Vert \varvec{\beta }_1-\varvec{\beta }_2 \Vert ^2+ \Vert \eta _1-\eta _2\Vert ^2_{\infty }\), C8 holds if w has bounded support \({\mathcal {S}}_w\subset \{\Vert {\mathbf {x}}\Vert \le A_1\}\) or \(\mathbb {P}(\Vert {\mathbf {x}}\Vert \le A_1)=1\) and for some positive constant \(A_2\)

    $$\begin{aligned} C_0= \inf _{a \in {\mathcal {V}}}\inf _{({\mathbf {x}}_0,t_0) \in {\mathcal {S}}_w \times [0,1]}\inf _{|s-s_0|<A_2} \mathbb {E}\left( \chi \left( y,s, a\right) \left| ({\mathbf {x}},t)=({\mathbf {x}}_0,t_0)\right. \right) >0\,, \end{aligned}$$
    (A.10)

    where \(s_0= {\mathbf {x}}_0^{\textsc {t}}\varvec{\beta }_0+\eta _0(t_0)\).

Note that (A.9) is the robust counterpart of the assumption, used in Theorem 1 of Lu (2015), that the conditional variance of \(y|({\mathbf {x}},t)\) is bounded away from 0. Assumption (A.10) is fulfilled, for instance, if \(\mathbb {E}\left( \chi \left( y,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t), a\right) \left| ({\mathbf {x}},t)=({\mathbf {x}}_0,t_0)\right. \right) >0\) and the function \(\chi (y,s,a)\) is continuous in all its arguments. These two conditions hold, for instance, under the partial linear model (6), both for symmetric errors and for errors with density (8), when the functions \(\phi \) and \(\upsilon \) satisfy assumptions N3 and N5 needed to derive the asymptotic normality of the regression estimators \(\widehat{\varvec{\beta }}\).

Proof of Lemma 1

For any \(\varvec{\theta }\in \mathbb {R}^p\times {\mathcal {M}}_n({\mathcal {T}}_n, \ell )\), define \(M_{{\varvec{\theta }}}(s)= L(\varvec{\theta }_0+s (\varvec{\theta }-\varvec{\theta }_0),a)\), so that \(M_{{\varvec{\theta }}}(1)=L(\varvec{\theta },a)\) and \(M_{{\varvec{\theta }}}(0)=L(\varvec{\theta }_0,a)\). Furthermore, writing \(b({\mathbf {x}},t)={\mathbf {x}}^{\textsc {t}}(\varvec{\beta }-\varvec{\beta }_0)+g(t )-\eta _0(t )\), we have

$$\begin{aligned} M_{{\varvec{\theta }}}^{\prime }(s)= & {} \mathbb {E}\left[ w({\mathbf {x}})\varPsi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t )+\,s\, b({\mathbf {x}},t),a) b({\mathbf {x}},t)\right] \\ M_{{\varvec{\theta }}}^{\prime \,\prime }(s)= & {} \mathbb {E}\left[ w({\mathbf {x}})\, \chi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_0+\eta _0(t)+\,s\, b({\mathbf {x}},t), a) b^2({\mathbf {x}},t) \right] \,. \end{aligned}$$

Assumption C9 implies that \(M_{{\varvec{\theta }}}^{\prime }(0)=0\); hence, a second-order Taylor expansion entails that, for some \(0<\xi <1\), \(M_{{\varvec{\theta }}}(1)-M_{{\varvec{\theta }}}(0)= M_{{\varvec{\theta }}}^{\prime \,\prime }(\xi )/2\).

  (a)

    Write \(\varvec{\beta }_\xi =\varvec{\beta }_0+\xi (\varvec{\beta }-\varvec{\beta }_0)\) and \(g_\xi =\eta _0+ \xi (g-\eta _0)=(1-\xi )\eta _0+\xi g\); then \(\varvec{\theta }_\xi =(\varvec{\beta }_\xi ,g_\xi )\in \varTheta \) for \(g\in {\mathcal {G}}\), and \(\pi (\varvec{\theta }_\xi ,\varvec{\theta }_0)=\xi \pi (\varvec{\theta }, \varvec{\theta }_0)\). Therefore, for \(a\in {\mathcal {V}}\) and \(\varvec{\theta }\in \mathbb {R}^p\times {\mathcal {M}}_n({\mathcal {T}}_n, \ell )\) such that \(\pi (\varvec{\theta },\varvec{\theta }_0)<\epsilon _0\), we have that

    $$\begin{aligned}&L(\varvec{\theta }, a)-L(\varvec{\theta }_0, a)\\&\quad = M_{{\varvec{\theta }}}(1)-M_{{\varvec{\theta }}}(0)=\frac{1}{2}\; \mathbb {E}\left[ w({\mathbf {x}})\, \chi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_\xi + g_\xi (t ), a) \, b^2({\mathbf {x}},t) \right] \\&\quad = \frac{1}{2}\; \mathbb {E}\left[ w({\mathbf {x}})\, \mathbb {E}\left\{ \chi (y ,{\mathbf {x}}^{\textsc {t}}\varvec{\beta }_\xi + g_\xi (t ), a) \Big | ({\mathbf {x}},t)\right\} \, b^2({\mathbf {x}},t) \mathbb {I}_{{\mathcal {S}}_{w}\times [0,1]}({\mathbf {x}},t) \right] \\&\quad \ge \frac{C_0}{2}\, \mathbb {E}\left[ w({\mathbf {x}})\, b^2({\mathbf {x}},t)\right] = \frac{C_0}{2}\, \pi _{\mathbb {P}}^2(\varvec{\theta }, \varvec{\theta }_0)\,, \end{aligned}$$

    where we have used that \(\pi (\varvec{\theta }_\xi ,\varvec{\theta }_0) <\epsilon _0\) and (A.9), concluding the proof of (a).

  (b)

    Assume that \(\pi ^2(\varvec{\theta }_1 ,\varvec{\theta }_2 ) =\Vert \varvec{\beta }_1-\varvec{\beta }_2 \Vert ^2+ \Vert \eta _1-\eta _2\Vert ^2_{\infty }\) and that (A.10) holds. Let \(s_0={\mathbf {x}}_0^{\textsc {t}}\varvec{\beta }_0+\eta _0(t_0)\) with \({\mathbf {x}}_0\in {\mathcal {S}}_w\). Using that \( |{\mathbf {x}}_0 ^{\textsc {t}}\varvec{\beta }_\xi + g_\xi (t_0 )- s_0|\le A_1 \Vert \varvec{\beta }_\xi -\varvec{\beta }_0\Vert + |g_\xi (t_0 )- \eta _0(t_0)|\le (1+A_1)\,\pi (\varvec{\theta }, \varvec{\theta }_0)\), we get that \(|{\mathbf {x}}_0 ^{\textsc {t}}\varvec{\beta }_\xi + g_\xi (t_0 )- s_0|< A_2\) whenever \(\pi (\varvec{\theta }, \varvec{\theta }_0)\le \epsilon _0\), with \(\epsilon _0< A_2/(1+A_1)\). The proof follows as in (a) using (A.10). \(\square \)


Cite this article

Boente, G., Rodriguez, D. & Vena, P. Robust estimators in a generalized partly linear regression model under monotony constraints. TEST 29, 50–89 (2020). https://doi.org/10.1007/s11749-019-00629-7


Keywords

  • B-splines
  • Deviance
  • Isotonic regression
  • Partial linear models
  • Robust estimation

Mathematics Subject Classification

  • 62F30
  • 62G35