Robust estimation in single-index models when the errors have a unimodal density with unknown nuisance parameter

Agostinelli, Claudio; Bianco, Ana M.; Boente, Graciela

doi:10.1007/s10463-019-00712-8

Published: 21 March 2019

Volume 72, pages 855–893, (2020)

Annals of the Institute of Statistical Mathematics

Claudio Agostinelli¹,
Ana M. Bianco² &
Graciela Boente³

Abstract

This paper develops a robust profile estimation method for the parametric and nonparametric components of a single-index model when the errors have a strongly unimodal density with unknown nuisance parameter. We derive consistency results for the link function estimators as well as consistency and asymptotic distribution results for the single-index parameter estimators. Under a log-Gamma model, the sensitivity to anomalous observations is studied using the empirical influence curve. We also discuss a robust K-fold cross-validation procedure to select the smoothing parameters. A numerical study carried out with errors following a log-Gamma model and under contaminated schemes shows the good robustness properties of the proposed estimators and the advantages of considering a robust approach instead of the classical one. A real data set illustrates the use of our proposal.

References

Aït Sahalia, Y. (1995). The delta method for nonparametric kernel functionals. Ph.D. dissertation, University of Chicago.
Bianco, A., Boente, G. (2002). On the asymptotic behavior of one-step estimation. Statistics and Probability Letters, 60, 33–47.
Bianco, A., Boente, G. (2007). Robust estimators under a semiparametric partly linear autoregression model: asymptotic behavior and bandwidth selection. Journal of Time Series Analysis, 28, 274–306.
Bianco, A., García Ben, M., Yohai, V. (2005). Robust estimation for linear regression with asymmetric errors. Canadian Journal of Statistics, 33, 511–528.
Boente, G., Fraiman, R., Meloche, J. (1997). Robust plug-in bandwidth estimators in nonparametric regression. Journal of Statistical Planning and Inference, 57, 109–142.
Boente, G., Rodriguez, D. (2008). Robust bandwidth selection in semiparametric partly linear regression models: Monte Carlo study and influential analysis. Computational Statistics and Data Analysis, 52, 2808–2828.
Boente, G., Rodriguez, D. (2010). Robust inference in generalized partially linear models. Computational Statistics and Data Analysis, 54, 2942–2966.
Boente, G., Rodriguez, D. (2012). Robust estimates in generalized partially linear single-index models. TEST, 21, 386–411.
Cantoni, E., Ronchetti, E. (2001). Resistant selection of the smoothing parameter for smoothing splines. Statistics and Computing, 11, 141–146.
Cantoni, E., Ronchetti, E. (2006). A robust approach for skewed and heavy-tailed outcomes in the analysis of health care expenditures. Journal of Health Economics, 25, 198–213.
Carroll, R., Fan, J., Gijbels, I., Wand, M. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92, 477–489.
Chang, Z. Q., Xue, L. G., Zhu, L. X. (2010). On an asymptotically more efficient estimation of the single-index model. Journal of Multivariate Analysis, 101, 1898–1901.
Croux, C., Ruiz-Gazen, A. (2005). High breakdown estimators for principal components: the projection-pursuit approach revisited. Journal of Multivariate Analysis, 95, 206–226.
Delecroix, M., Hristache, M., Patilea, V. (2006). On semiparametric $M$-estimation in single-index regression. Journal of Statistical Planning and Inference, 136, 730–769.
Hampel, F. R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association, 69, 383–394.
Härdle, W., Hall, P., Ichimura, H. (1993). Optimal smoothing in single-index models. Annals of Statistics, 21, 157–178.
Härdle, W., Stoker, T. M. (1989). Investigating smooth multiple regression by method of average derivatives. Journal of the American Statistical Association, 84, 986–995.
Hubert, M., Vandervieren, E. (2008). An adjusted boxplot for skewed distributions. Computational Statistics and Data Analysis, 52, 5186–5201.
Leung, D. (2005). Cross-validation in nonparametric regression with outliers. Annals of Statistics, 33, 2291–2310.
Leung, D., Marriott, F., Wu, E. (1993). Bandwidth selection in robust smoothing. Journal of Nonparametric Statistics, 4, 333–339.
Li, W., Patilea, V. (2017). A new inference approach for single-index models. Journal of Multivariate Analysis, 158, 47–59.
Liu, J., Zhang, R., Zhao, W., Lv, Y. (2013). A robust and efficient estimation method for single index models. Journal of Multivariate Analysis, 122, 226–238.
Mallows, C. (1974). On some topics in robustness. Memorandum, Bell Laboratories, Murray Hill, N.J.
Manchester, L. (1996). Empirical influence for robust smoothing. Australian Journal of Statistics, 38, 275–296.
Marazzi, A., Yohai, V. (2004). Adaptively truncated maximum likelihood regression with asymmetric errors. Journal of Statistical Planning and Inference, 122, 271–291.
Maronna, R., Martin, D., Yohai, V. (2006). Robust statistics: Theory and methods. New York: Wiley.
Pollard, D. (1984). Convergence of stochastic processes. Springer series in statistics. New York: Springer.
Powell, J. L., Stock, J. H., Stoker, T. M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57, 1403–1430.
Rodriguez, D. (2007). Estimación robusta en modelos parcialmente lineales generalizados. Ph.D. Thesis (in Spanish), Universidad de Buenos Aires. http://cms.dm.uba.ar/academico/carreras/doctorado/tesisdanielarodriguez.pdf. Accessed 20 Feb 2019.
Rousseeuw, P. J., Yohai, V. J. (1984). Robust regression by means of $S$-estimators. In J. Franke, W. Härdle, D. Martin (Eds.), Robust and nonlinear time series, Lecture notes in statistics (Vol. 26, pp. 256–272). New York: Springer.
Severini, T., Staniswalis, J. (1994). Quasi-likelihood estimation in semiparametric models. Journal of the American Statistical Association, 89, 501–511.
Severini, T., Wong, W. (1992). Profile likelihood and conditionally parametric models. Annals of Statistics, 20(4), 1768–1802.
Sherman, R. (1994). Maximal inequalities for degenerate $U$-processes with applications to optimization estimators. Annals of Statistics, 22, 439–459.
Sun, Y., Genton, M. G. (2011). Functional boxplots. Journal of Computational and Graphical Statistics, 20, 316–334.
Tamine, J. (2002). Smoothed influence function: Another view at robust nonparametric regression. Discussion paper 62, Sonderforschungsbereich 373, Humboldt-Universität zu Berlin.
Tukey, J. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
van der Vaart, A. (1988). Estimating a real parameter in a class of semiparametric models. Annals of Statistics, 16(4), 1450–1474.
Wang, F., Scott, D. (1994). The L1 method for robust nonparametric regression. Journal of the American Statistical Association, 89, 65–76.
Wang, Q., Zhang, T., Härdle, W. (2014). An extended single index model with missing response at random. SFB 649 Discussion Paper 2014-003.
Wu, T. Z., Yu, K., Yu, Y. (2010). Single index quantile regression. Journal of Multivariate Analysis, 101, 1607–1621.
Xia, Y., Härdle, W. (2006). Semi-parametric estimation of partially linear single-index models. Journal of Multivariate Analysis, 97, 1162–1184.
Xia, Y., Härdle, W., Linton, O. (2012). Optimal smoothing for a computationally and statistically efficient single index estimator. In Exploring research frontiers in contemporary statistics and econometrics: A Festschrift for Léopold Simar (pp. 229–261).
Xia, Y., Tong, H., Li, W. K., Zhu, L. (2002). An adaptive estimation of dimension reduction space (with discussion). Journal of the Royal Statistical Society, Series B, 64, 363–410.
Xue, L. G., Zhu, L. X. (2006). Empirical likelihood for single-index model. Journal of Multivariate Analysis, 97, 1295–1312.
Zhang, R., Huang, R., Lv, Z. (2010). Statistical inference for the index parameter in single-index models. Journal of Multivariate Analysis, 101, 1026–1041.

Acknowledgements

The authors wish to thank an anonymous referee for valuable comments which led to an improved version of the original paper. This research was partially supported by Grants PICT 2014-0351 from ANPCYT, Grants 20120130100279BA and 20020170100022BA from the Universidad de Buenos Aires at Buenos Aires, Argentina and also by the Spanish Project MTM2016-76969P from the Ministry of Science and Innovation, Spain. It was also supported by the Italian–Argentinian project "Metodi robusti per la previsione del costo e della durata della degenza ospedaliera", funded by the joint collaboration program MINCYT-MAE AR14MO6 (IT1306) between MINCYT from Argentina and MAE from Italy.

Author information

Authors and Affiliations

Dipartimento di Matematica, Università di Trento, Via Sommarive, 14, 38123, Trento, Italy
Claudio Agostinelli
Instituto de Cálculo, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Ciudad Universitaria, Pabellón 2, 1428, Buenos Aires, Argentina
Ana M. Bianco
Departamento de Matemáticas, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and IMAS, CONICET, Ciudad Universitaria, Pabellón 1, 1428, Buenos Aires, Argentina
Graciela Boente

Corresponding author

Correspondence to Graciela Boente.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Appendix

A.1 Proof of Theorem 1

a) For any $\varepsilon >0$, let ${\mathcal {X}}_0$ be a compact set such that $P(\mathbf {x}\notin {\mathcal {X}}_0)<\varepsilon $. Then, we have that

$$\begin{aligned}&\displaystyle \sup _{{\varvec{{\varvec{\beta }}}},\mathbf {b}\in {\mathcal {S}}_1; a \in {\mathcal {K}}}\left| \varDelta _n({\varvec{\beta }},\widehat{\eta }_{\mathbf {b},a},a) -\varDelta _n({\varvec{\beta }},\eta _{\mathbf {b},a},a)\right| \\&\quad \le \sup _{\mathbf {b}\in {\mathcal {S}}_1,a \in {\mathcal {K}}}\Vert \widehat{\eta }_{\mathbf {b},a}-\eta _{\mathbf {b},a}\Vert _{0,\infty } \Vert \tau \Vert _{\infty } \Vert \phi ^{\prime }\Vert _{\infty } +\, 2 \Vert \phi \Vert _{\infty }\frac{1}{n}\sum _{i=1}^n \mathbb {I}_{(\mathbf {x}_i\notin {\mathcal {X}}_0)} \tau (\mathbf {x}_i) \end{aligned}$$

and so, using (10), the fact that $P(\mathbf {x}\notin {\mathcal {X}}_0)<\varepsilon $ and the strong law of large numbers, we get that

$$\begin{aligned} \displaystyle \sup _{{\varvec{{\varvec{\beta }}}},\mathbf {b}\in {\mathcal {S}}_1; a \in {\mathcal {K}}}\left| \varDelta _n({\varvec{\beta }},\widehat{\eta }_{\mathbf {b},a},a)-\varDelta _n({\varvec{\beta }},\eta _{\mathbf {b},a},a)\right| \buildrel {a.s.}\over \longrightarrow 0\;. \end{aligned}$$

Therefore, it remains to show that $\displaystyle \sup _{{\varvec{{\varvec{\beta }}}},\mathbf {b}\in {\mathcal {S}}_1; a \in {\mathcal {K}}}\left| \varDelta _n({\varvec{\beta }},\eta _{\mathbf {b},a},a)-\varDelta ({\varvec{\beta }},\eta _{\mathbf {b},a},a)\right| \buildrel {a.s.}\over \longrightarrow 0$. Define the following class of functions ${\mathcal {H}}=\{f_{{\varvec{{\varvec{\beta }}}}}(y,\mathbf {x})=\phi (y,\eta _{\mathbf {b},a}({\varvec{\beta }}^{\textsc {t}}\mathbf {x}), a) \tau (\mathbf {x}) \,,\, {\varvec{\beta }},\mathbf {b}\in {\mathcal {S}}_1, a \in {\mathcal {K}}\}$. Using Theorem 3 from Chapter 2 in Pollard (1984), the compactness of ${\mathcal {K}}$, A1, the continuity of $\eta _{{\varvec{{\varvec{\beta }}}}, \alpha }(u)$ given in A6 and analogous arguments to those considered in Lemma 1 from Bianco and Boente (2002), we get that $\displaystyle \sup _{{\varvec{{\varvec{\beta }}}},\mathbf {b}\in {\mathcal {S}}_1; a \in {\mathcal {K}}}\left| \varDelta _n({\varvec{\beta }},\widehat{\eta }_{\mathbf {b},a},a)-\varDelta ({\varvec{\beta }},\eta _{\mathbf {b},a},a)\right| \buildrel {a.s.}\over \longrightarrow 0$ and a) follows.

b) Let $\widehat{{\varvec{\beta }}}_k$ be a subsequence of $\widehat{{\varvec{\beta }}}$ such that $\widehat{{\varvec{\beta }}}_k\rightarrow {\varvec{\beta }}^*$, where ${\varvec{\beta }}^*$ lies in the compact set ${\mathcal {S}}_1$. Let us assume, without loss of generality, that $\widehat{{\varvec{\beta }}} \buildrel {a.s.}\over \longrightarrow {\varvec{\beta }}^*$. Then, A7, the continuity of $\eta _{{\varvec{{\varvec{\beta }}}}, \alpha }$, the consistency of $\widehat{\alpha }_{{\textsc {r}}}$ and a) entail that $\varDelta _n(\widehat{{\varvec{\beta }}},\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}})-\varDelta ({\varvec{\beta }}^*,\eta _0,\alpha _0) \buildrel {a.s.}\over \longrightarrow 0$ and $\varDelta _n( {\varvec{\beta }}_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}} )-\varDelta ({\varvec{\beta }}_0,\eta _{0},\alpha _0) \buildrel {a.s.}\over \longrightarrow 0$, since $\eta _{{\varvec{{\varvec{\beta }}}}_0,\alpha _0}=\eta _0$. Now, using that $\varDelta _n( {\varvec{\beta }}_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}} )\ge \varDelta _n(\widehat{{\varvec{\beta }}},\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}})$ and $\varDelta ({\varvec{\beta }},\eta _0,\alpha _0)$ has a unique minimum at ${\varvec{\beta }}_0$, we conclude the proof. $\square $
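The closing step of b) is a standard argmin argument. Written out as a sketch, using only the two convergences stated above and the fact that $\widehat{{\varvec{\beta }}}$ minimizes the empirical criterion, it reads

$$\begin{aligned} \varDelta ({\varvec{\beta }}_0,\eta _0,\alpha _0)&= \lim _{n\rightarrow \infty } \varDelta _n({\varvec{\beta }}_0,\widehat{\eta }_{\widehat{{\varvec{\beta }}},\widehat{\alpha }_{\textsc {r}}},\widehat{\alpha }_{\textsc {r}}) \\&\ge \lim _{n\rightarrow \infty } \varDelta _n(\widehat{{\varvec{\beta }}},\widehat{\eta }_{\widehat{{\varvec{\beta }}},\widehat{\alpha }_{\textsc {r}}},\widehat{\alpha }_{\textsc {r}}) = \varDelta ({\varvec{\beta }}^*,\eta _0,\alpha _0) \quad \text{a.s.} \end{aligned}$$

Since ${\varvec{\beta }}_0$ is the unique minimizer of $\varDelta ({\varvec{\beta }},\eta _0,\alpha _0)$, equality must hold and ${\varvec{\beta }}^*={\varvec{\beta }}_0$. As every convergent subsequence of $\widehat{{\varvec{\beta }}}$ has this same limit and ${\mathcal {S}}_1$ is compact, the whole sequence converges to ${\varvec{\beta }}_0$ almost surely.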

A.2 Proof of Proposition 1

a) The single-index parameter estimation related to Step LG2 is obtained by means of the minimization with respect to ${\varvec{\beta }}$ of

$$\begin{aligned} \sum _{i=1}^n \rho \left( \frac{\sqrt{d\left( y_i, \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}} \left( {\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i\right) \right) }}{c}\right) \tau (\mathbf {x}_i) , \end{aligned}$$

among the vectors of length one, where, at the same time, $\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)$ is defined as

$$\begin{aligned} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)= & {} \displaystyle \mathop {\text{ argmin }}_{a\in \mathbb {R}} \sum _{i=1}^n \rho \left( \frac{\sqrt{d(y_i,a)}}{c}\right) W_{h}(u,{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i ). \end{aligned}$$
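To make the definition of $\widehat{\eta }_{{\varvec{\beta }}}(u)$ concrete, the following minimal Python sketch computes the local M-estimate at a single point $u$. Several choices are assumptions made only for illustration and are not taken from the text: $\rho $ is a Tukey bisquare scaled to take values in $[0,1]$, $W_h$ are normalized Epanechnikov kernel weights, and the minimization is a grid search followed by a golden-section refinement. The deviance $d(y,a)=\exp (y-a)-(y-a)-1$ is the log-Gamma form consistent with the expression of $\psi $ used in this proof. All function names are hypothetical.

```python
import math


def deviance(y, a):
    """Log-Gamma deviance d(y, a) = exp(y - a) - (y - a) - 1, clipped at 0."""
    r = y - a
    return max(0.0, math.expm1(r) - r)


def rho_bisquare(t):
    """Tukey bisquare rho scaled to reach 1 for |t| >= 1 (assumed choice)."""
    if abs(t) >= 1.0:
        return 1.0
    return 1.0 - (1.0 - t * t) ** 3


def kernel_weights(u, index_values, h):
    """Normalized Epanechnikov weights W_h(u, b^T x_i) (assumed form)."""
    raw = [max(0.0, 0.75 * (1.0 - ((u - v) / h) ** 2)) for v in index_values]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no observations within bandwidth h of u")
    return [w / total for w in raw]


def local_m_estimate(u, index_values, y, h, c, grid_size=400, refine_steps=60):
    """Minimize sum_i rho(sqrt(d(y_i, a)) / c) * W_h(u, b^T x_i) over a."""
    w = kernel_weights(u, index_values, h)

    def objective(a):
        return sum(wi * rho_bisquare(math.sqrt(deviance(yi, a)) / c)
                   for wi, yi in zip(w, y) if wi > 0.0)

    # Coarse grid search: the objective may have several local minima.
    lo, hi = min(y) - 1.0, max(y) + 1.0
    step = (hi - lo) / grid_size
    best = min((lo + k * step for k in range(grid_size + 1)), key=objective)

    # Golden-section refinement around the best grid point.
    a_lo, a_hi = best - step, best + step
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(refine_steps):
        m1 = a_hi - inv_phi * (a_hi - a_lo)
        m2 = a_lo + inv_phi * (a_hi - a_lo)
        if objective(m1) <= objective(m2):
            a_hi = m2
        else:
            a_lo = m1
    return 0.5 * (a_lo + a_hi)
```

For instance, `local_m_estimate(0.0, [0.0] * 21, [2.0] * 20 + [10.0], h=1.0, c=0.5)` stays close to 2, because the bounded $\rho $ gives the isolated response at 10 a constant contribution near the bulk of the data.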

Hence, if we denote ${\mathcal {B}}({\varvec{\theta }})={\varvec{\theta }}/\Vert {\varvec{\theta }}\Vert $, we have that $\widehat{{\varvec{\beta }}}_{\varepsilon }=\widehat{{\varvec{\theta }}}_{\varepsilon }/\Vert \widehat{{\varvec{\theta }}}_{\varepsilon }\Vert ={\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })$ where $\widehat{{\varvec{\theta }}}_{\varepsilon }$ is the solution of

$$\begin{aligned}&\mathop {\text{ argmin }}_{{\varvec{{\varvec{\theta }}}}} \frac{1-\varepsilon }{n} \sum _{i=1}^n \rho \left( \frac{\sqrt{d\left( y_i, \widehat{\eta }^{\varepsilon }_{{\mathcal {B}}({\varvec{{\varvec{\theta }}}})} \left( {\mathcal {B}}({\varvec{\theta }})^{\textsc {t}}\mathbf {x}_i\right) \right) }}{c}\right) \tau (\mathbf {x}_i)\\&\quad +\, \varepsilon \, \rho \left( \frac{\sqrt{d\left( y_0, \widehat{\eta }^{\varepsilon }_{{\mathcal {B}}({\varvec{{\varvec{\theta }}}})} \left( {\mathcal {B}}({\varvec{\theta }})^{\textsc {t}}\mathbf {x}_0\right) \right) }}{c}\right) \tau (\mathbf {x}_0). \end{aligned}$$
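The first-order condition stated next follows from the chain rule through the normalization map ${\mathcal {B}}$. As a brief reminder (a standard computation, not specific to this paper), for ${\varvec{\theta }}\ne \mathbf{{0}}$,

$$\begin{aligned} \frac{\partial {\mathcal {B}}({\varvec{\theta }})}{\partial {\varvec{\theta }}} = \frac{1}{\Vert {\varvec{\theta }}\Vert }\,\mathbf{I}- \frac{{\varvec{\theta }}{\varvec{\theta }}^{\textsc {t}}}{\Vert {\varvec{\theta }}\Vert ^3} = \frac{1}{\Vert {\varvec{\theta }}\Vert }\left( \mathbf{I}- {\mathcal {B}}({\varvec{\theta }}){\mathcal {B}}({\varvec{\theta }})^{\textsc {t}}\right) . \end{aligned}$$

Hence, for a generic smooth criterion $L({\mathcal {B}}({\varvec{\theta }}))$, where $L$ is a placeholder introduced here and not notation from the paper, the gradient with respect to ${\varvec{\theta }}$ equals $\Vert {\varvec{\theta }}\Vert ^{-1}\left( \mathbf{I}- {\mathcal {B}}({\varvec{\theta }}){\mathcal {B}}({\varvec{\theta }})^{\textsc {t}}\right) \nabla _{{\varvec{\beta }}} L$ evaluated at ${\varvec{\beta }}={\mathcal {B}}({\varvec{\theta }})$. The positive factor $\Vert {\varvec{\theta }}\Vert ^{-1}$ can be dropped when the gradient is set to zero, which explains the projection matrix in the estimating equation.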

Then, $\widehat{{\varvec{\theta }}}_{\varepsilon }$ satisfies

$$\begin{aligned} \mathbf{{0}}= & {} \left( \mathbf{I}- {\mathcal {B}}\left( \widehat{{\varvec{\theta }}}_{\varepsilon }\right) {\mathcal {B}}\left( \widehat{{\varvec{\theta }}}_{\varepsilon }\right) ^{\textsc {t}}\right) \left[ \frac{(1-\varepsilon )}{n}\sum _{i=1}^n \psi \left( y_i,\widehat{\eta }^{\varepsilon }_{{\mathcal {B}}({\varvec{\widehat{{\varvec{\theta }}}}}_{\varepsilon })} ({\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })^{\textsc {t}}\mathbf {x}_i),c\right) \widehat{{\varvec{\nu }}}_i^{\epsilon }\left( {\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon }),{\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })^{\textsc {t}}\mathbf {x}_i\right) \tau (\mathbf {x}_i) \right. \\&\left. +\; \varepsilon \; \psi \left( y_0,\widehat{\eta }^{\varepsilon }_{{\mathcal {B}}({\varvec{\widehat{{\varvec{\theta }}}}}_{\varepsilon })}({\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })^{\textsc {t}}\mathbf {x}_0),c\right) \widehat{{\varvec{\nu }}}_0^{\epsilon }\left( {\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon }),{\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })^{\textsc {t}}\mathbf {x}_0\right) \tau (\mathbf {x}_0) \right] \; , \end{aligned}$$

where

$$\begin{aligned} \psi (y,a,c)= \frac{\partial }{\partial a} \phi (y,a,c)= \frac{1}{2c}\varPsi \left( \frac{\sqrt{d(y,a)}}{c}\right) \frac{1- \exp (y-a)}{\sqrt{d(y,a)}} \end{aligned}$$

as defined in (11), $\varPsi $ stands for the derivative of $\rho $ and $\widehat{{\varvec{\nu }}}_i^{\epsilon }(\mathbf {b},t)$ are given by

$$\begin{aligned} \widehat{{\varvec{\nu }}}_i^{\epsilon }(\mathbf {b},t)={ \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}^{\epsilon }(s)|_{({\varvec{{\varvec{\beta }}}},s)=(\mathbf {b},t)}+ \frac{\partial }{\partial s} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}^{\epsilon }(s)|_{({\varvec{{\varvec{\beta }}}},s)=(\mathbf {b},t)}\;\mathbf {x}_i} \; . \end{aligned}$$
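The closed form of $\psi (y,a,c)$ can be checked numerically against a finite difference of $\phi (y,a,c)=\rho \left( \sqrt{d(y,a)}/c\right) $. The sketch below does this in Python under assumptions that are not part of the text: $\rho $ is a Tukey bisquare scaled to $[0,1]$, so that $\varPsi (t)=6t(1-t^2)^2$ for $|t|<1$ and $0$ otherwise, and $d(y,a)=\exp (y-a)-(y-a)-1$ is the log-Gamma deviance consistent with the factor $1-\exp (y-a)$ above. Function names are hypothetical.

```python
import math


def deviance(y, a):
    """Log-Gamma deviance d(y, a) = exp(y - a) - (y - a) - 1, clipped at 0."""
    r = y - a
    return max(0.0, math.expm1(r) - r)


def rho(t):
    """Tukey bisquare rho scaled to [0, 1] (assumed choice of rho)."""
    return 1.0 if abs(t) >= 1.0 else 1.0 - (1.0 - t * t) ** 3


def big_psi(t):
    """Derivative of rho: 6 t (1 - t^2)^2 inside (-1, 1), zero outside."""
    return 6.0 * t * (1.0 - t * t) ** 2 if abs(t) < 1.0 else 0.0


def phi(y, a, c):
    """phi(y, a, c) = rho(sqrt(d(y, a)) / c)."""
    return rho(math.sqrt(deviance(y, a)) / c)


def psi(y, a, c):
    """Closed form: Psi(sqrt(d)/c) * (1 - exp(y - a)) / (2 c sqrt(d))."""
    d = deviance(y, a)
    if d == 0.0:
        return 0.0  # limiting value as y - a -> 0
    root = math.sqrt(d)
    return big_psi(root / c) * (1.0 - math.exp(y - a)) / (2.0 * c * root)


def max_fd_error(points, c, delta=1e-6):
    """Largest gap between psi and a central difference of phi in a."""
    worst = 0.0
    for y, a in points:
        fd = (phi(y, a + delta, c) - phi(y, a - delta, c)) / (2.0 * delta)
        worst = max(worst, abs(fd - psi(y, a, c)))
    return worst
```

Calling `max_fd_error([(1.0, 0.6), (0.2, 0.5), (3.0, 0.0), (-1.0, 1.0)], c=1.5)` returns the largest discrepancy over the listed $(y,a)$ pairs, which should be negligible if the formula and the assumed $\rho $ are coded consistently.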

Using that $\widehat{{\varvec{\beta }}}_{\varepsilon }={\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })$, we get that the estimator $\widehat{{\varvec{\beta }}}_{\varepsilon }$ verifies

$$\begin{aligned} \mathbf{{0}}= & {} \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}_{\varepsilon }} {\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}}\right) \left[ \frac{(1-\varepsilon )}{n}\sum _{i=1}^n \psi \left( y_i,\widehat{\eta }^{\varepsilon }_{{\varvec{\widehat{{\varvec{\beta }}}}}_{\varepsilon }}({\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i),c\right) \widehat{{\varvec{\nu }}}_i^{\epsilon }({\widehat{{\varvec{\beta }}}_{\varepsilon }},{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i) \tau (\mathbf {x}_i) \right. \\&\left. +\; \varepsilon \; \psi \left( y_0,\widehat{\eta }^{\varepsilon }_{{\varvec{\widehat{{\varvec{\beta }}}}}_{\varepsilon }}({\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_0),c\right) \widehat{{\varvec{\nu }}}_0^{\epsilon }({\widehat{{\varvec{\beta }}}_{\varepsilon }},{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_0) \tau (\mathbf {x}_0) \right] \; \end{aligned}$$

and $\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u)$ is the solution of

$$\begin{aligned} \frac{(1-\varepsilon )}{n}\sum _{i=1}^n \psi \left( y_i,\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) W_{h}(u,{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i ) +\varepsilon \; \psi \left( y_0,\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) W_{h}(u,{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_0 ) = 0 \; . \end{aligned}$$

(A.1)
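Equation (A.1) also suggests a direct numerical approximation of the empirical influence of a point $(\mathbf {x}_0,y_0)$ on the local fit at a fixed ${\varvec{\beta }}$: solve (A.1) for a small $\varepsilon $ and for $\varepsilon =0$, then take the difference quotient. The Python sketch below illustrates this under assumptions not taken from the text: a bisquare $\rho $, Epanechnikov kernel weights, the log-Gamma deviance $d(y,a)=\exp (y-a)-(y-a)-1$, and a user-supplied bracket on which the estimating function changes sign. All names are hypothetical.

```python
import math


def psi(y, a, c):
    """psi(y, a, c) of the text, with Psi from a bisquare rho (assumed)."""
    r = y - a
    d = max(0.0, math.expm1(r) - r)  # log-Gamma deviance d(y, a)
    if d == 0.0:
        return 0.0
    t = math.sqrt(d) / c
    big_psi = 6.0 * t * (1.0 - t * t) ** 2 if t < 1.0 else 0.0
    return big_psi * (1.0 - math.exp(r)) / (2.0 * c * math.sqrt(d))


def kernel(z):
    """Epanechnikov kernel (assumed choice for W_h)."""
    return 0.75 * (1.0 - z * z) if abs(z) < 1.0 else 0.0


def estimating_function(a, eps, u, idx, y, idx0, y0, h, c):
    """Left-hand side of (A.1) evaluated at the candidate value a."""
    clean = sum(psi(yi, a, c) * kernel((u - vi) / h) for vi, yi in zip(idx, y))
    extra = psi(y0, a, c) * kernel((u - idx0) / h)
    return (1.0 - eps) * clean / len(y) + eps * extra


def solve_local(eps, u, idx, y, idx0, y0, h, c, bracket, iters=100):
    """Bisection root of (A.1) on a bracket with a sign change."""
    lo, hi = bracket
    args = (u, idx, y, idx0, y0, h, c)
    if not (estimating_function(lo, eps, *args) < 0.0
            < estimating_function(hi, eps, *args)):
        raise ValueError("bracket does not enclose a sign change")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if estimating_function(mid, eps, *args) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def empirical_influence(u, idx, y, idx0, y0, h, c, bracket, eps=1e-4):
    """Difference quotient in eps of the local fit at u (beta held fixed)."""
    eta_eps = solve_local(eps, u, idx, y, idx0, y0, h, c, bracket)
    eta_0 = solve_local(0.0, u, idx, y, idx0, y0, h, c, bracket)
    return (eta_eps - eta_0) / eps
```

With a redescending $\varPsi $ such as the bisquare assumed here, a response $y_0$ far from the bulk contributes $\psi (y_0,a,c)=0$ throughout the bracket, so the approximated influence vanishes, while a moderately shifted $y_0$ moves the fit in its direction. This only covers the term $\mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{\beta }}}(u))$ at fixed ${\varvec{\beta }}$, not the full influence of the profile estimator.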

Then, if we call

$$\begin{aligned} {\varvec{\lambda }}(\varepsilon )= & {} \frac{(1-\varepsilon )}{n}\sum _{i=1}^n \psi \left( y_i,\widehat{\eta }^{\varepsilon }_{{\varvec{\widehat{{\varvec{\beta }}}}}_{\varepsilon }}({\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i),c\right) \widehat{{\varvec{\nu }}}_i^{\epsilon }({\widehat{{\varvec{\beta }}}_{\varepsilon }},{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i) \tau (\mathbf {x}_i) \\&+\; \varepsilon \; \psi \left( y_0,\widehat{\eta }^{\varepsilon }_{{\varvec{\widehat{{\varvec{\beta }}}}}_{\varepsilon }}({\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_0),c\right) \widehat{{\varvec{\nu }}}_0^{\epsilon }({\widehat{{\varvec{\beta }}}_{\varepsilon }},{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_0) \tau (\mathbf {x}_0) \end{aligned}$$

we get that, for any $0\le \varepsilon <1$, $\widehat{{\varvec{\beta }}}_{\varepsilon }$ satisfies $ \mathbf{{0}}= \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}_{\varepsilon }} {\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}}\right) \; {\varvec{\lambda }}(\varepsilon ). $ Therefore, differentiating with respect to $\varepsilon $, evaluating at $\varepsilon =0$ and using that ${\varvec{\lambda }}(0)=\mathbf{{0}}$, we obtain that

$$\begin{aligned} \mathbf{{0}}= & {} \frac{\partial }{\partial \varepsilon }\left. \left[ \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}_{\varepsilon }} {\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}}\right) {\varvec{\lambda }}(\varepsilon ) \right] \right| _{\varepsilon =0} =\frac{\partial }{\partial \varepsilon }\left. \left[ \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}_{\varepsilon }} {\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}}\right) \right] \right| _{\varepsilon =0} {\varvec{\lambda }}(0) + \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) \frac{\partial }{\partial \varepsilon }\left. {\varvec{\lambda }}(\varepsilon ) \right| _{\varepsilon =0} \nonumber \\= & {} \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) \frac{\partial }{\partial \varepsilon }\left. {\varvec{\lambda }}(\varepsilon ) \right| _{\varepsilon =0} \,. \end{aligned}$$

(A.2)

Henceforth, in order to compute $\left. ({\partial {\varvec{\lambda }}(\varepsilon )}/{\partial \varepsilon }) \right| _{\varepsilon =0}$ and to simplify the presentation, we consider the following functions:

$$\begin{aligned} h(\varepsilon ,{\varvec{\beta }},u) =\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u)\;,\qquad h_{{\varvec{{\varvec{\beta }}}}}(\varepsilon ,{\varvec{\beta }},u) = \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u)\;,\qquad h_{u}(\varepsilon ,{\varvec{\beta }},u) = \frac{\partial }{\partial u} \widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u) \end{aligned}$$

and their corresponding derivatives with respect to $\varepsilon $

$$\begin{aligned} H_i= & {} \left. \frac{\partial }{\partial \varepsilon } h(\varepsilon ,\widehat{{\varvec{\beta }}}_{\varepsilon },{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i)\right| _{\varepsilon =0}\;,\qquad H_{{\varvec{{\varvec{\beta }}}},i}= \left. \frac{\partial }{\partial \varepsilon } h_{{\varvec{{\varvec{\beta }}}}}(\varepsilon ,\widehat{{\varvec{\beta }}}_{\varepsilon },{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i)\right| _{\varepsilon =0}\;,\qquad \\ H_{u,i}= & {} \left. \frac{\partial }{\partial \varepsilon } h_u(\varepsilon ,\widehat{{\varvec{\beta }}}_{\varepsilon },{\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}} \mathbf {x}_i)\right| _{\varepsilon =0}\;. \end{aligned}$$

Thus, we have that

$$\begin{aligned} \left. \frac{\partial }{\partial \varepsilon } {\varvec{\lambda }}(\varepsilon ) \right| _{\varepsilon =0}= & {} -\frac{1}{n} \sum _{i=1}^n \psi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \widehat{{\varvec{\nu }}}_i({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i) \tau (\mathbf {x}_i) \\&+\; \frac{1}{n} \sum _{i=1}^n \left\{ \chi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; H_i \; \widehat{{\varvec{\nu }}}_i({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i)\right. \\&\left. +\, \psi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; (H_{{\varvec{{\varvec{\beta }}}},i}+ \mathbf {x}_i H_{u,i})\right\} \tau (\mathbf {x}_i)\\&+ \; \psi \left( y_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_0),c\right) \widehat{{\varvec{\nu }}}_0({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_0) \tau (\mathbf {x}_0). \end{aligned}$$

Since ${\varvec{\lambda }}(0)=\mathbf{{0}}$, we obtain that

$$\begin{aligned} \left. \frac{\partial }{\partial \varepsilon } {\varvec{\lambda }}(\varepsilon ) \right| _{\varepsilon =0}= & {} \frac{1}{n} \sum _{i=1}^n \left\{ \chi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; H_i \; \widehat{{\varvec{\nu }}}_i({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i) \right. \nonumber \\&\left. + \psi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; (H_{{\varvec{{\varvec{\beta }}}},i}+ \mathbf {x}_i H_{u,i})\right\} \tau (\mathbf {x}_i) \nonumber \\&+ \; \psi \left( y_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_0),c\right) \widehat{{\varvec{\nu }}}_0({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_0) \tau (\mathbf {x}_0). \end{aligned}$$

(A.3)

It remains to compute the functions $H_i$, $H_{{\varvec{{\varvec{\beta }}}},i}$ and $ H_{u,i}$. Straightforward arguments lead to

$$\begin{aligned} H_i= & {} \left. \frac{\partial }{\partial \varepsilon } h(\varepsilon ,\widehat{{\varvec{\beta }}}_{\varepsilon }, \widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}\mathbf {x}_i)\right| _{\varepsilon =0}\\= & {} \left. \frac{\partial }{\partial \varepsilon } h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)} +\left. \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s}) =(0,\widehat{\mathbf {s}}_i)}\left. \frac{\partial }{\partial \varepsilon } \widehat{{\varvec{\beta }}}_{\varepsilon }\right| _{\varepsilon =0}\\&+\left. \frac{\partial }{\partial u} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s}) =(0,\widehat{\mathbf {s}}_i)}\left. \frac{\partial }{\partial \varepsilon } \widehat{{\varvec{\beta }}}_{\varepsilon }\right| _{\varepsilon =0} \mathbf {x}_i\; , \end{aligned}$$

where $\widehat{\mathbf {s}}_i=(\widehat{{\varvec{\beta }}},\widehat{{\varvec{\beta }}}^{\textsc {t}}\mathbf {x}_i)$. Then, we get that

$$\begin{aligned} H_i= & {} \left. \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u))\right| _{({\varvec{{\varvec{\beta }}}}, u)=\widehat{\mathbf {s}}_i}+\left. \frac{\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial {\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}}, u)=\widehat{\mathbf {s}}_i}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) +\left. \frac{\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial u} \right| _{({\varvec{{\varvec{\beta }}}}, u)=\widehat{\mathbf {s}}_i}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\mathbf {x}_i\\= & {} \left. \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u))\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} + \widehat{{\varvec{\nu }}}_i(\widehat{\mathbf {s}}_i)^{\textsc {t}}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\;. \end{aligned}$$

Analogously, we have that

$$\begin{aligned} H_{{\varvec{{\varvec{\beta }}}},i}= & {} \left. \frac{\partial }{\partial \varepsilon } h_{{\varvec{{\varvec{\beta }}}}}(\varepsilon ,\widehat{{\varvec{\beta }}}_{\varepsilon },\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}\mathbf {x}_i)\right| _{\varepsilon =0}\\= & {} \left. \frac{\partial }{\partial \varepsilon }\frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)} +\left. \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)}\left. \frac{\partial }{\partial \varepsilon }\widehat{{\varvec{\beta }}}_{\varepsilon }\right| _{\varepsilon =0}\\&+\left. \frac{\partial }{\partial u} \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)} \left. \frac{\partial }{\partial \varepsilon }\widehat{{\varvec{\beta }}}_{\varepsilon }\right| _{\varepsilon =0} \mathbf {x}_i\; , \end{aligned}$$

so

$$\begin{aligned} H_{{\varvec{{\varvec{\beta }}}},i}= & {} \left. \mathop {\mathrm{EIF}}\left( \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right) \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}+\left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial ^2{\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) +\left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial u \partial {\varvec{{\varvec{\beta }}}}}\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\\&\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\mathbf {x}_i. \end{aligned}$$

Finally, in a similar way, we obtain that

$$\begin{aligned} H_{u,i}= & {} \left. \frac{\partial }{\partial \varepsilon } h_{u}(\varepsilon ,\widehat{{\varvec{\beta }}}_{\varepsilon },\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}\mathbf {x}_i)\right| _{\varepsilon =0}\\= & {} \left. \frac{\partial }{\partial \varepsilon }\frac{\partial }{\partial u} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)} +\left. \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \frac{\partial }{\partial u} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)}\left. \frac{\partial }{\partial \varepsilon }\widehat{{\varvec{\beta }}}_{\varepsilon }\right| _{\varepsilon =0}\\&+\left. \frac{\partial }{\partial u } \frac{\partial }{\partial u} h(\varepsilon ,{\varvec{\beta }},u)\right| _{(\varepsilon ,\mathbf {s})=(0,\widehat{\mathbf {s}}_i)} \left. \frac{\partial }{\partial \varepsilon }\widehat{{\varvec{\beta }}}_{\varepsilon }\right| _{\varepsilon =0} \mathbf {x}_i\; , \end{aligned}$$

which implies that

$$\begin{aligned} H_{u,i}= & {} \left. \mathop {\mathrm{EIF}}\left( \frac{\partial }{\partial u} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right) \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}+\left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial {\varvec{{\varvec{\beta }}}}\partial u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\\&+\left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial ^2 u }\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\mathbf {x}_i\; . \end{aligned}$$

Using the previous expressions, we deduce that

$$\begin{aligned} H_{{\varvec{{\varvec{\beta }}}},i}+ \mathbf {x}_i H_{u,i}= & {} \left. \mathop {\mathrm{EIF}}\left( \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right) \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} + \left. \mathop {\mathrm{EIF}}\left( \frac{\partial }{\partial u} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right) \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i\; \\&+ \left[ \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial ^2{\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} + \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial ^2 u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i \mathbf {x}_i^{\textsc {t}}\right. \\&\left. +\left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial u \partial {\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i^{\textsc {t}}+ \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial {\varvec{{\varvec{\beta }}}}\partial u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i^{\textsc {t}}\right] \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\; . \end{aligned}$$

Now, replacing $H_i$, $H_{{\varvec{{\varvec{\beta }}}},i}$ and $H_{u,i}$ in (A.3) with the expressions obtained above, we have that

$$\begin{aligned} \left. \frac{\partial }{\partial \varepsilon } {\varvec{\lambda }}(\varepsilon ) \right| _{\varepsilon =0}= & {} \frac{1}{n} \sum _{i=1}^n \chi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; \tau (\mathbf {x}_i) \; {\left. \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u))\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}} \; \widehat{{\varvec{\nu }}}_i({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i) \nonumber \\&+\; \frac{1}{n} \sum _{i=1}^n \chi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; \; \tau (\mathbf {x}_i) \; \widehat{{\varvec{\nu }}}_i({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i) \widehat{{\varvec{\nu }}}_i({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i)^{\textsc {t}}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\\&+ \; \frac{1}{n} \sum _{i=1}^n \psi \left( y_i,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_i),c\right) \; \tau (\mathbf {x}_i) \left\{ \left. \mathop {\mathrm{EIF}}\left( \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right) \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\right. \\&\left. +\left. \mathop {\mathrm{EIF}}\left( \frac{\partial }{\partial u} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right) \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i\;\right. \\&+\;\left. \left[ \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial ^2{\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} + \left. 
\frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial ^2 u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i \mathbf {x}_i^{\textsc {t}}+\left. \frac{\partial ^2\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial u \partial {\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i^{\textsc {t}}\right. \right. \\&\left. \left. + \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial {\varvec{{\varvec{\beta }}}}\partial u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i^{\textsc {t}}\right] \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\right\} \\&+\; \psi \left( y_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}}}({\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_0),c\right) \widehat{{\varvec{\nu }}}_0({\widehat{{\varvec{\beta }}}},{\widehat{{\varvec{\beta }}}^{\textsc {t}}} \mathbf {x}_0) \tau (\mathbf {x}_0). \end{aligned}$$

Recall that

$$\begin{aligned} \mathbf {V}(\widehat{\mathbf {s}}_i)= & {} \left[ \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial ^2{\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} + \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial ^2 u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i \mathbf {x}_i^{\textsc {t}}+\left. \frac{\partial ^2\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial u \partial {\varvec{{\varvec{\beta }}}}} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i^{\textsc {t}}\right. \\&\left. + \left. \frac{\partial ^2 \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) }{\partial {\varvec{{\varvec{\beta }}}}\partial u} \right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i} \mathbf {x}_i^{\textsc {t}}\right] . \end{aligned}$$

Then, we get that

$$\begin{aligned} \left. \frac{\partial }{\partial \varepsilon } {\varvec{\lambda }}(\varepsilon ) \right| _{\varepsilon =0} = \varvec{\ell }_n+ \mathbf {M}_n \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}), \end{aligned}$$

where $\varvec{\ell }_n\in \mathbb {R}^q$ and $\mathbf {M}_n\in \mathbb {R}^{q\times q}$ are defined in (20) and (21). Replacing in (A.2), we have that

$$\begin{aligned} \mathbf{{0}}= & {} \left( \mathbf {I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) (\varvec{\ell }_n+ \mathbf {M}_n \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) ) . \end{aligned}$$

It is worth noticing that since $\Vert \widehat{{\varvec{\beta }}}_{\varepsilon }\Vert ^2=1$, differentiating with respect to $\varepsilon $ and evaluating at $\varepsilon =0$, we have that

$$\begin{aligned} 0=\left. \frac{\partial }{\partial \varepsilon }\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}\widehat{{\varvec{\beta }}}_{\varepsilon } \right| _{\varepsilon =0}= 2 \widehat{{\varvec{\beta }}}^{\textsc {t}}\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) \; \end{aligned}$$

which, taking into account that $\widehat{{\varvec{\beta }}}=\mathbf {e}_q$, implies that $\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})_q=0$. Therefore, we only have to compute $\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})_j$ for $j=1,\dots ,q-1$.

Using again that $\widehat{{\varvec{\beta }}}=\mathbf {e}_q$, we obtain that

$$\begin{aligned} \left( \mathbf {I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) = \left( \begin{array}{cc} \mathbf {I}_{q-1}&{}\quad \mathbf {0}\\ \mathbf {0}&{}\quad 0\end{array}\right) \,. \end{aligned}$$

Hence, the upper-left $(q-1)\times (q-1)$ block of $\left( \mathbf {I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) \mathbf {M}_n$ equals the matrix $\mathbf {M}_{n,1}\in \mathbb {R}^{(q-1)\times (q-1)}$, so that $ \mathbf{{0}}= \left( \mathbf {I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) (\varvec{\ell }_n+ \mathbf {M}_n \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) ) $ implies

$$\begin{aligned} \mathbf{{0}}= \varvec{\ell }_n^{(q-1)}+ \mathbf {M}_{n,1} \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}^{(q-1)}). \end{aligned}$$

(A.4)

Therefore, from (A.4) we get that $\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}^{(q-1)}) = - \mathbf {M}_{n,1}^{-1} \varvec{\ell }_n^{(q-1)} $.
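The reduction to (A.4) can be illustrated numerically. The following is a minimal sketch, assuming $\widehat{{\varvec{\beta }}}=\mathbf {e}_q$ as in the text; the function name `eif_beta` and the random inputs standing in for $\varvec{\ell }_n$ and $\mathbf {M}_n$ are hypothetical placeholders, not quantities from the paper.

```python
import numpy as np

def eif_beta(ell_n, M_n):
    """Solve 0 = ell_n^(q-1) + M_{n,1} EIF^(q-1), assuming beta_hat = e_q."""
    q = ell_n.shape[0]
    M_n1 = M_n[: q - 1, : q - 1]      # upper-left (q-1) x (q-1) block of M_n
    eif = np.zeros(q)                  # last coordinate is 0 by the norm constraint
    eif[: q - 1] = -np.linalg.solve(M_n1, ell_n[: q - 1])
    return eif

# Placeholder inputs (random, for illustration only; not taken from the paper).
rng = np.random.default_rng(0)
q = 4
ell_n = rng.normal(size=q)
M_n = rng.normal(size=(q, q)) + 10.0 * np.eye(q)

eif = eif_beta(ell_n, M_n)
beta_hat = np.eye(q)[:, -1]                      # e_q
P = np.eye(q) - np.outer(beta_hat, beta_hat)     # I - beta_hat beta_hat^T
residual = P @ (ell_n + M_n @ eif)               # should vanish
```

Solving the reduced linear system with `np.linalg.solve` avoids forming $\mathbf {M}_{n,1}^{-1}$ explicitly; the vector `residual` checks the projected equation and `beta_hat @ eif` checks the orthogonality constraint.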

Note that $\varvec{\ell }_n$ and $\mathbf {M}_n$ involve $\left. \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u))\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}$, $ \left. \mathop {\mathrm{EIF}}( {\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial {\varvec{{\varvec{\beta }}}}})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}$ and $ \left. \mathop {\mathrm{EIF}}( {\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial u})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}$.

b) Let us derive $ \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)) $. Since $\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u)$ is the solution of (A.1), we have that

$$\begin{aligned} \frac{(1-\varepsilon )}{n}\sum _{i=1}^n K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \psi \left( y_i,\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) +\varepsilon \; K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_0-u) \psi \left( y_0,\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) = 0. \end{aligned}$$

Differentiating with respect to $\varepsilon $ and evaluating at $\varepsilon =0$, we obtain that

$$\begin{aligned} \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u))= -\frac{K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_0-u) \psi \left( y_0,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) }{\displaystyle \frac{1}{n}\sum \nolimits _{i=1}^n K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) } \,, \end{aligned}$$

(A.5)

where $\chi (y,a,c)=\partial \psi (y,a,c)/\partial a$. The term $-\frac{1}{n}\sum _{i=1}^n K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \psi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) $ produced by the factor $(1-\varepsilon )$ vanishes because $\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)$ solves the equation with $\varepsilon =0$, so the denominator involves $\chi $ rather than $\psi $.

Analogously, differentiating first with respect to ${\varvec{\beta }}$ on both sides of Eq. (A.1) and then with respect to $\varepsilon $, and evaluating at $\varepsilon =0$, we can obtain an expression for $ \left. \mathop {\mathrm{EIF}}({\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial {\varvec{\beta }}})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}$. Alternatively, we may differentiate (A.5) with respect to ${\varvec{\beta }}$ to obtain

$$\begin{aligned}&\mathop {\mathrm{EIF}}\left( \frac{\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\partial {\varvec{{\varvec{\beta }}}}}\right) = -\frac{\frac{1}{h} K_h^{\prime }({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_0-u) \psi \left( y_0,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) \mathbf {x}_0+ K_h ({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_0-u) \chi \left( y_0,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) \displaystyle \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}{\displaystyle \frac{1}{n}\sum \nolimits _{i=1}^n K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) }\\&\quad +\; \frac{K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_0-u) \psi \left( y_0,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) }{\left\{ \displaystyle \frac{1}{n}\sum \nolimits _{i=1}^n K_h({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) \right\} ^2}\, \left[ \frac{1}{n} \sum _{i=1}^n \frac{1}{h} K_h^{\prime }({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) \mathbf {x}_i\right. \\&\qquad \left. +\; \frac{1}{n} \sum _{i=1}^n K_h ({\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u) \chi ^{\prime }\left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),c\right) \displaystyle \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\right] \,, \end{aligned}$$

where $\chi ^{\prime }(y,a,c)=\partial \chi (y,a,c)/\partial a$.

Similar arguments lead to the expression for $ \left. \mathop {\mathrm{EIF}}({\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial u})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}$.
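The $\varepsilon $-derivative of the solution of the contaminated local equation can be checked by finite differences. The sketch below is only illustrative and rests on several assumptions that are not in the paper: a smooth bounded score $\psi (y,a,c)=c\tanh ((y-a)/c)$ replaces the log-Gamma score, $\chi $ is its derivative with respect to $a$, the kernel is Gaussian, and the data, the point $(t_0,y_0)$ and all helper names are hypothetical. Implicit differentiation at $\varepsilon =0$ gives $-K_h(t_0-u)\psi (y_0,\widehat{\eta },c)$ divided by the average of $K_h(t_i-u)\chi (y_i,\widehat{\eta },c)$, which is the value compared with the difference quotient.

```python
import numpy as np

def psi(y, a, c):
    # Hypothetical smooth bounded score, decreasing in a (not the paper's log-Gamma score).
    return c * np.tanh((y - a) / c)

def chi(y, a, c):
    # chi = d psi / d a for the score above.
    return -(1.0 - np.tanh((y - a) / c) ** 2)

def K_h(t, h):
    # Gaussian kernel rescaled by the bandwidth: K_h(t) = K(t / h) / h.
    return np.exp(-0.5 * (t / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def solve_eta(weights, y, c, tol=1e-12):
    # Bisection on a -> sum_i w_i psi(y_i, a, c), which is decreasing in a.
    lo, hi = y.min() - 1.0, y.max() + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.sum(weights * psi(y, mid, c)) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Simulated projections t_i = beta^T x_i and responses (illustration only).
rng = np.random.default_rng(1)
n, u, h, c = 200, 0.0, 0.3, 1.5
t = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(t) + 0.3 * rng.normal(size=n)
t0, y0 = 0.1, 5.0                      # contaminating point (beta^T x_0, y_0)

def eta_eps(eps):
    # Solution of the contaminated equation with mass eps at (t0, y0).
    w = np.append((1.0 - eps) / n * K_h(t - u, h), eps * K_h(t0 - u, h))
    return solve_eta(w, np.append(y, y0), c)

eta0 = eta_eps(0.0)
# Closed form obtained by implicit differentiation at eps = 0.
eif = -K_h(t0 - u, h) * psi(y0, eta0, c) / np.mean(K_h(t - u, h) * chi(y, eta0, c))
# Forward difference quotient in eps.
eps = 1e-6
fd = (eta_eps(eps) - eta0) / eps
```

Because the score is bounded, `eif` stays bounded as `y0` grows, which is the robustness property the empirical influence function is meant to display.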

Finally, note that $\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)$ satisfies

$$\begin{aligned} \sum _{i=1}^n K\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \psi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right)= & {} 0 . \end{aligned}$$

(A.6)

Hence, differentiating (A.6) with respect to ${\varvec{\beta }}$, we get that

$$\begin{aligned} 0= & {} \frac{1}{h} \sum _{i=1}^n K^{\prime }\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \psi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \mathbf {x}_i \\&+ \;\sum _{i=1}^n K\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \times \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u), \end{aligned}$$

which implies that

$$\begin{aligned} \frac{\partial }{\partial {\varvec{{\varvec{\beta }}}}} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)= & {} -\frac{1}{h} \left[ \sum _{i=1}^n K\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \right] ^{-1} \,\\&\sum _{i=1}^n K^{\prime }\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \psi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \mathbf {x}_i \;. \end{aligned}$$
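The expression for $\partial \widehat{\eta }_{{\varvec{\beta }}}(u)/\partial {\varvec{\beta }}$ can be compared with central finite differences. This is a rough sketch under assumptions that are not in the paper: the score $\psi (y,a,\alpha )$ is replaced by the hypothetical smooth score $c\tanh ((y-a)/c)$ with a fixed constant `c` in place of $\alpha $, $K$ is the standard Gaussian density, and the simulated data and helper names are placeholders.

```python
import numpy as np

def psi(y, a, c):
    # Hypothetical smooth bounded score (stand-in for the paper's psi).
    return c * np.tanh((y - a) / c)

def chi(y, a, c):
    # Derivative of psi with respect to its second argument.
    return -(1.0 - np.tanh((y - a) / c) ** 2)

def K(z):
    # Standard Gaussian kernel.
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def K_prime(z):
    # Derivative of the Gaussian kernel.
    return -z * K(z)

def eta_hat(beta, u, X, y, h, c, tol=1e-12):
    # Solve sum_i K((beta^T x_i - u)/h) psi(y_i, a, c) = 0 in a by bisection.
    w = K((X @ beta - u) / h)
    lo, hi = y.min() - 1.0, y.max() + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.sum(w * psi(y, mid, c)) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def grad_eta(beta, u, X, y, h, c):
    # -(1/h) [sum_i K chi]^{-1} sum_i K' psi x_i, evaluated at eta_hat.
    a = eta_hat(beta, u, X, y, h, c)
    z = (X @ beta - u) / h
    num = (K_prime(z) * psi(y, a, c)) @ X
    den = np.sum(K(z) * chi(y, a, c))
    return -num / (h * den)

# Simulated data (illustration only).
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))
beta = np.array([0.6, 0.0, 0.8])
y = np.sin(X @ beta) + 0.3 * rng.normal(size=300)
u, h, c = 0.2, 0.5, 1.5

grad = grad_eta(beta, u, X, y, h, c)
delta = 1e-4
fd = np.array([
    (eta_hat(beta + delta * e, u, X, y, h, c)
     - eta_hat(beta - delta * e, u, X, y, h, c)) / (2.0 * delta)
    for e in np.eye(3)
])
```

The finite differences perturb ${\varvec{\beta }}$ without renormalizing it, since the formula above is the unconstrained derivative.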

On the other hand, differentiating (A.6) with respect to u, we obtain that

$$\begin{aligned} 0= & {} -\frac{1}{h} \sum _{i=1}^n K^{\prime }\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \psi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \\&+ \;\sum _{i=1}^n K\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \times \frac{\partial }{\partial u} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u) \end{aligned}$$

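which entails that

$$\begin{aligned} \frac{\partial }{\partial u} \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)= & {} \frac{1}{h} \left[ \sum _{i=1}^n K\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \chi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \right] ^{-1} \,\sum _{i=1}^n K^{\prime }\left( \frac{{\varvec{\beta }}^{\textsc {t}}\mathbf {x}_i-u}{h}\right) \psi \left( y_i,\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u),\alpha \right) \;. \end{aligned}$$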

About this article

Cite this article

Agostinelli, C., Bianco, A.M. & Boente, G. Robust estimation in single-index models when the errors have a unimodal density with unknown nuisance parameter. Ann Inst Stat Math 72, 855–893 (2020). https://doi.org/10.1007/s10463-019-00712-8

Received: 26 January 2018
Revised: 15 February 2019
Published: 21 March 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10463-019-00712-8
