Abstract
This paper develops a robust profile estimation method for the parametric and nonparametric components of a single-index model when the errors have a strongly unimodal density with unknown nuisance parameter. We derive consistency results for the link function estimators as well as consistency and asymptotic distribution results for the single-index parameter estimators. Under a log-Gamma model, the sensitivity to anomalous observations is studied using the empirical influence curve. We also discuss a robust K-fold cross-validation procedure to select the smoothing parameters. A numerical study carried out with errors following a log-Gamma model and for contaminated schemes shows the good robustness properties of the proposed estimators and the advantages of considering a robust approach instead of the classical one. A real data set illustrates the use of our proposal.
References
Aït Sahalia, Y. (1995). The delta method for nonparametric kernel functionals. Ph.D. dissertation, University of Chicago.
Bianco, A., Boente, G. (2002). On the asymptotic behavior of one-step estimation. Statistics and Probability Letters, 60, 33–47.
Bianco, A., Boente, G. (2007). Robust estimators under a semiparametric partly linear autoregression model: asymptotic behavior and bandwidth selection. Journal of Time Series Analysis, 28, 274–306.
Bianco, A., García Ben, M., Yohai, V. (2005). Robust estimation for linear regression with asymmetric errors. Canadian Journal of Statistics, 33, 511–528.
Boente, G., Fraiman, R., Meloche, J. (1997). Robust plug-in bandwidth estimators in nonparametric regression. Journal of Statistical Planning and Inference, 57, 109–142.
Boente, G., Rodriguez, D. (2008). Robust bandwidth selection in semiparametric partly linear regression models: Monte Carlo study and influential analysis. Computational Statistics and Data Analysis, 52, 2808–2828.
Boente, G., Rodriguez, D. (2010). Robust inference in generalized partially linear models. Computational Statistics and Data Analysis, 54, 2942–2966.
Boente, G., Rodriguez, D. (2012). Robust estimates in generalized partially linear single-index models. TEST, 21, 386–411.
Cantoni, E., Ronchetti, E. (2001). Resistant selection of the smoothing parameter for smoothing splines. Statistics and Computing, 11, 141–146.
Cantoni, E., Ronchetti, E. (2006). A robust approach for skewed and heavy-tailed outcomes in the analysis of health care expenditures. Journal of Health Economics, 25, 198–213.
Carroll, R., Fan, J., Gijbels, I., Wand, M. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92, 477–489.
Chang, Z. Q., Xue, L. G., Zhu, L. X. (2010). On an asymptotically more efficient estimation of the single-index model. Journal of Multivariate Analysis, 101, 1898–1901.
Croux, C., Ruiz-Gazen, A. (2005). High breakdown estimators for principal components: the projection-pursuit approach revisited. Journal of Multivariate Analysis, 95, 206–226.
Delecroix, M., Hristache, M., Patilea, V. (2006). On semiparametric \(M\)-estimation in single-index regression. Journal of Statistical Planning and Inference, 136, 730–769.
Hampel, F. R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association, 69, 383–394.
Härdle, W., Hall, P., Ichimura, H. (1993). Optimal smoothing in single-index models. Annals of Statistics, 21, 157–178.
Härdle, W., Stoker, T. M. (1989). Investigating smooth multiple regression by method of average derivatives. Journal of the American Statistical Association, 84, 986–995.
Hubert, M., Vandervieren, E. (2008). An adjusted boxplot for skewed distributions. Computational Statistics and Data Analysis, 52, 5186–5201.
Leung, D. (2005). Cross-validation in nonparametric regression with outliers. Annals of Statistics, 33, 2291–2310.
Leung, D., Marriott, F., Wu, E. (1993). Bandwidth selection in robust smoothing. Journal of Nonparametric Statistics, 4, 333–339.
Li, W., Patilea, V. (2017). A new inference approach for single-index models. Journal of Multivariate Analysis, 158, 47–59.
Liu, J., Zhang, R., Zhao, W., Lv, Y. (2013). A robust and efficient estimation method for single index models. Journal of Multivariate Analysis, 122, 226–238.
Mallows, C. (1974). On some topics in robustness. Memorandum, Bell Laboratories, Murray Hill, N.J.
Manchester, L. (1996). Empirical influence for robust smoothing. Australian Journal of Statistics, 38, 275–296.
Marazzi, A., Yohai, V. (2004). Adaptively truncated maximum likelihood regression with asymmetric errors. Journal of Statistical Planning and Inference, 122, 271–291.
Maronna, R., Martin, D., Yohai, V. (2006). Robust statistics: Theory and methods. New York: Wiley.
Pollard, D. (1984). Convergence of stochastic processes. Springer series in statistics. New York: Springer.
Powell, J. L., Stock, J. H., Stoker, T. M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57, 1403–1430.
Rodriguez, D. (2007). Estimación robusta en modelos parcialmente lineales generalizados. Ph.D. Thesis (in Spanish), Universidad de Buenos Aires. http://cms.dm.uba.ar/academico/carreras/doctorado/tesisdanielarodriguez.pdf. Accessed 20 Feb 2019.
Rousseeuw, P. J., Yohai, V. J. (1984). Robust regression by means of \(S\)-estimators. In J. Franke, W. Härdle, D. Martin (Eds.), Robust and nonlinear time series, Lecture notes in statistics (Vol. 26, pp. 256–272). New York: Springer.
Severini, T., Staniswalis, J. (1994). Quasi-likelihood estimation in semiparametric models. Journal of the American Statistical Association, 89, 501–511.
Severini, T., Wong, W. (1992). Profile likelihood and conditionally parametric models. Annals of Statistics, 20(4), 1768–1802.
Sherman, R. (1994). Maximal inequalities for degenerate \(U\)-processes with applications to optimization estimators. Annals of Statistics, 22, 439–459.
Sun, Y., Genton, M. G. (2011). Functional boxplots. Journal of Computational and Graphical Statistics, 20, 316–334.
Tamine, J. (2002). Smoothed influence function: Another view at robust nonparametric regression. Discussion paper 62, Sonderforschungsbereich 373, Humboldt-Universität zu Berlin.
Tukey, J. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
van der Vaart, A. (1988). Estimating a real parameter in a class of semiparametric models. Annals of Statistics, 16(4), 1450–1474.
Wang, F., Scott, D. (1994). The L1 method for robust nonparametric regression. Journal of the American Statistical Association, 89, 65–76.
Wang, Q., Zhang, T., Härdle, W. (2014). An extended single index model with missing response at random. SFB 649 Discussion Paper 2014-003.
Wu, T. Z., Yu, K., Yu, Y. (2010). Single index quantile regression. Journal of Multivariate Analysis, 101, 1607–1621.
Xia, Y., Härdle, W. (2006). Semi-parametric estimation of partially linear single-index models. Journal of Multivariate Analysis, 97, 1162–1184.
Xia, Y., Härdle, W., Linton, O. (2012). Optimal smoothing for a computationally and statistically efficient single index estimator. In Exploring research frontiers in contemporary statistics and econometrics: A Festschrift for Léopold Simar (pp. 229–261).
Xia, Y., Tong, H., Li, W. K., Zhu, L. (2002). An adaptive estimation of dimension reduction space (with discussion). Journal of the Royal Statistical Society, Series B, 64, 363–410.
Xue, L. G., Zhu, L. X. (2006). Empirical likelihood for single-index model. Journal of Multivariate Analysis, 97, 1295–1312.
Zhang, R., Huang, R., Lv, Z. (2010). Statistical inference for the index parameter in single-index models. Journal of Multivariate Analysis, 101, 1026–1041.
Acknowledgements
The authors wish to thank an anonymous referee for valuable comments which led to an improved version of the original paper. This research was partially supported by Grant PICT 2014-0351 from ANPCYT, Grants 20120130100279BA and 20020170100022BA from the Universidad de Buenos Aires at Buenos Aires, Argentina and also by the Spanish Project MTM2016-76969P from the Ministry of Science and Innovation, Spain. It was also supported by the Italian–Argentinian project "Metodi robusti per la previsione del costo e della durata della degenza ospedaliera" funded by the joint collaboration program MINCYT-MAE AR14MO6 (IT1306) between MINCYT from Argentina and MAE from Italy.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A Appendix
A.1 Proof of Theorem 1
a) For any \(\varepsilon >0\), let \({\mathcal {X}}_0\) be a compact set such that \(P(\mathbf {x}\notin {\mathcal {X}}_0)<\varepsilon \). Then, we have that
and so, using (10), the fact that \(P(\mathbf {x}\notin {\mathcal {X}}_0)<\varepsilon \) and the strong law of large numbers, we get that
Therefore, it remains to show that \(\displaystyle \sup _{{\varvec{{\varvec{\beta }}}},\mathbf {b}\in {\mathcal {S}}_1; a \in {\mathcal {K}}}\left| \varDelta _n({\varvec{\beta }},\eta _{\mathbf {b},a},a)-\varDelta ({\varvec{\beta }},\eta _{\mathbf {b},a},a)\right| \buildrel {a.s.}\over \longrightarrow 0\). Define the following class of functions \({\mathcal {H}}=\{f_{{\varvec{{\varvec{\beta }}}}}(y,\mathbf {x})=\phi (y,\eta _{\mathbf {b},a}({\varvec{\beta }}^{\textsc {t}}\mathbf {x}), a) \tau (\mathbf {x}) \,,\, {\varvec{\beta }},\mathbf {b}\in {\mathcal {S}}_1, a \in {\mathcal {K}}\}\). Using Theorem 3 from Chapter 2 in Pollard (1984), the compactness of \({\mathcal {K}}\), A1, the continuity of \(\eta _{{\varvec{{\varvec{\beta }}}}, \alpha }(u)\) given in A6 and analogous arguments to those considered in Lemma 1 from Bianco and Boente (2002), we get that \(\displaystyle \sup _{{\varvec{{\varvec{\beta }}}},\mathbf {b}\in {\mathcal {S}}_1; a \in {\mathcal {K}}}\left| \varDelta _n({\varvec{\beta }},\widehat{\eta }_{\mathbf {b},a},a)-\varDelta ({\varvec{\beta }},\eta _{\mathbf {b},a},a)\right| \buildrel {a.s.}\over \longrightarrow 0\) and a) follows.
b) Let \(\widehat{{\varvec{\beta }}}_k\) be a subsequence of \(\widehat{{\varvec{\beta }}}\) such that \(\widehat{{\varvec{\beta }}}_k\rightarrow {\varvec{\beta }}^*\), where \({\varvec{\beta }}^*\) lies in the compact set \({\mathcal {S}}_1\). Let us assume, without loss of generality, that \(\widehat{{\varvec{\beta }}} \buildrel {a.s.}\over \longrightarrow {\varvec{\beta }}^*\). Then, A7, the continuity of \(\eta _{{\varvec{{\varvec{\beta }}}}, \alpha }\), the consistency of \(\widehat{\alpha }_{{\textsc {r}}}\) and a) entail that \(\varDelta _n(\widehat{{\varvec{\beta }}},\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}})-\varDelta ({\varvec{\beta }}^*,\eta _0,\alpha _0) \buildrel {a.s.}\over \longrightarrow 0\) and \(\varDelta _n( {\varvec{\beta }}_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}} )-\varDelta ({\varvec{\beta }}_0,\eta _{0},\alpha _0) \buildrel {a.s.}\over \longrightarrow 0\), since \(\eta _{{\varvec{{\varvec{\beta }}}}_0,\alpha _0}=\eta _0\). Now, using that \(\varDelta _n( {\varvec{\beta }}_0,\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}} )\ge \varDelta _n(\widehat{{\varvec{\beta }}},\widehat{\eta }_{{\varvec{\widehat{{\varvec{\beta }}}}},{\varvec{\widehat{\alpha }}}_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}})\) and \(\varDelta ({\varvec{\beta }},\eta _0,\alpha _0)\) has a unique minimum at \({\varvec{\beta }}_0\), we conclude the proof. \(\square \)
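The closing step of b) can be spelled out as a chain of limits. This is only a sketch that combines the two almost sure convergences and the minimizing property of \(\widehat{{\varvec{\beta }}}\) stated above, keeping the notation of the proof:

\[
\begin{aligned}
\varDelta ({\varvec{\beta }}^*,\eta _0,\alpha _0)
&= \lim _{n\rightarrow \infty } \varDelta _n(\widehat{{\varvec{\beta }}},\widehat{\eta }_{\widehat{{\varvec{\beta }}},\widehat{\alpha }_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}}) \\
&\le \lim _{n\rightarrow \infty } \varDelta _n({\varvec{\beta }}_0,\widehat{\eta }_{\widehat{{\varvec{\beta }}},\widehat{\alpha }_{{\textsc {r}}}},\widehat{\alpha }_{{\textsc {r}}})
= \varDelta ({\varvec{\beta }}_0,\eta _0,\alpha _0) .
\end{aligned}
\]

Since \({\varvec{\beta }}_0\) is the unique minimizer of \(\varDelta ({\varvec{\beta }},\eta _0,\alpha _0)\), the reverse inequality also holds, so the two values coincide and \({\varvec{\beta }}^*={\varvec{\beta }}_0\). Every convergent subsequence of \(\widehat{{\varvec{\beta }}}\) therefore has limit \({\varvec{\beta }}_0\), and the compactness of \({\mathcal {S}}_1\) gives the convergence of the whole sequence.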
A.2 Proof of Proposition 1
a) The single-index parameter estimation related to Step LG2 is obtained by means of the minimization with respect to \({\varvec{\beta }}\) of
among the vectors of length one, where, at the same time, \(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\) is defined as
Hence, if we denote \({\mathcal {B}}({\varvec{\theta }})={\varvec{\theta }}/\Vert {\varvec{\theta }}\Vert \), we have that \(\widehat{{\varvec{\beta }}}_{\varepsilon }=\widehat{{\varvec{\theta }}}_{\varepsilon }/\Vert \widehat{{\varvec{\theta }}}_{\varepsilon }\Vert ={\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })\) where \(\widehat{{\varvec{\theta }}}_{\varepsilon }\) is the solution of
Then, \(\widehat{{\varvec{\theta }}}_{\varepsilon }\) satisfies
where
as defined in (11), \(\varPsi \) stands for the derivative of \(\rho \) and \(\widehat{{\varvec{\nu }}}_i^{\epsilon }(\mathbf {b},t)\) are given by
Using that \(\widehat{{\varvec{\beta }}}_{\varepsilon }={\mathcal {B}}(\widehat{{\varvec{\theta }}}_{\varepsilon })\), we get that the estimator \(\widehat{{\varvec{\beta }}}_{\varepsilon }\) verifies
and \(\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u)\) is the solution of
Then, if we call
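we get that, for any \(0\le \varepsilon <1\), \(\widehat{{\varvec{\beta }}}_{\varepsilon }\) satisfies \( \mathbf{{0}}= \left( \mathbf{I}- {\widehat{{\varvec{\beta }}}_{\varepsilon }} {\widehat{{\varvec{\beta }}}_{\varepsilon }^{\textsc {t}}}\right) \; {\varvec{\lambda }}(\varepsilon ). \) Therefore, differentiating with respect to \(\varepsilon \), evaluating at \(\varepsilon =0\) and using that \({\varvec{\lambda }}(0)=\mathbf{{0}}\), we obtain that
\[ \mathbf{{0}}= \left( \mathbf{I}- \widehat{{\varvec{\beta }}} \widehat{{\varvec{\beta }}}^{\textsc {t}}\right) \left. \frac{\partial {\varvec{\lambda }}(\varepsilon )}{\partial \varepsilon } \right| _{\varepsilon =0}. \qquad \text{(A.2)} \]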
Henceforth, in order to compute \(\left. ({\partial {\varvec{\lambda }}(\varepsilon )}/{\partial \varepsilon }) \right| _{\varepsilon =0}\) and to simplify the presentation, we consider the following functions:
and their corresponding derivatives with respect to \(\varepsilon \)
Thus, we have that
Since \({\varvec{\lambda }}(0)=\mathbf{{0}}\), we obtain that
It remains to compute the functions \(H_i\), \(H_{{\varvec{{\varvec{\beta }}}},i}\) and \( H_{u,i}\). Straightforward arguments lead to
where \(\widehat{\mathbf {s}}_i=(\widehat{{\varvec{\beta }}},\widehat{{\varvec{\beta }}}^{\textsc {t}}\mathbf {x}_i)\). Then, we get that
Analogously, we have that
so
Finally, in a similar way, we obtain that
which implies that
Using the previous expressions, we deduce that
Now, replacing in (A.3) \(H_i\), \(H_{{\varvec{{\varvec{\beta }}}},i}\) and \(H_{u,i}\) with the obtained expression, we have that
Recall that
Then, we get that
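where \(\varvec{\ell }_n\in \mathbb {R}^q\) and \(\mathbf {M}_n\in \mathbb {R}^{q\times q}\) are defined in (20) and (21). Replacing in (A.2), we have that
\[ \mathbf{{0}}= \left( \mathbf {I}- \widehat{{\varvec{\beta }}} \widehat{{\varvec{\beta }}}^{\textsc {t}}\right) \left( \varvec{\ell }_n+ \mathbf {M}_n \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) \right) . \]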
It is worth noticing that since \(\Vert \widehat{{\varvec{\beta }}}_{\varepsilon }\Vert ^2=1\), differentiating with respect to \(\varepsilon \) and evaluating at \(\varepsilon =0\), we have that
\[ \widehat{{\varvec{\beta }}}^{\textsc {t}} \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) = 0 , \]
which, taking into account that \(\widehat{{\varvec{\beta }}}=\mathbf {e}_q\), implies that \(\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})_q=0\). Therefore, we only have to compute \(\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})_j\) for \(j=1,\dots ,q-1\).
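The orthogonality used in this step can be checked numerically. The sketch below is only an illustration under stated assumptions: it perturbs a vector along an arbitrary direction, renormalizes it with the map \({\mathcal {B}}\) and differentiates by central finite differences. The helper names `normalize` and `direction_derivative`, the dimension \(q=3\) and the perturbation direction are hypothetical and do not come from the paper.

```python
import math


def normalize(theta):
    """B(theta) = theta / ||theta||, the map onto the unit sphere."""
    norm = math.sqrt(sum(t * t for t in theta))
    return [t / norm for t in theta]


def direction_derivative(theta, direction, step=1e-6):
    """Central finite difference of eps -> B(theta + eps * direction) at eps = 0."""
    plus = normalize([t + step * d for t, d in zip(theta, direction)])
    minus = normalize([t - step * d for t, d in zip(theta, direction)])
    return [(p - m) / (2.0 * step) for p, m in zip(plus, minus)]


# Hypothetical setting: beta_hat = e_q with q = 3 and an arbitrary direction.
beta_hat = [0.0, 0.0, 1.0]
direction = [0.7, -1.2, 2.5]

deriv = direction_derivative(beta_hat, direction)

# Inner product of beta_hat with the derivative of the normalized curve.
inner = sum(b * d for b, d in zip(beta_hat, deriv))

print("derivative of the normalized curve:", deriv)
print("inner product with beta_hat:", inner)
```

Up to finite-difference error, the derivative is the projection of the direction onto the orthogonal complement of \(\widehat{{\varvec{\beta }}}\), so its last coordinate vanishes when \(\widehat{{\varvec{\beta }}}=\mathbf {e}_q\).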
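Using again that \(\widehat{{\varvec{\beta }}}=\mathbf {e}_q\), we obtain that
\[ \mathbf {I}- \widehat{{\varvec{\beta }}} \widehat{{\varvec{\beta }}}^{\textsc {t}} = \begin{pmatrix} \mathbf {I}_{q-1} & \mathbf{{0}} \\ \mathbf{{0}}^{\textsc {t}} & 0 \end{pmatrix} . \]
Hence, the upper-left block of \(\left( \mathbf {I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) \mathbf {M}_n\) equals the matrix \(\mathbf {M}_{n,1}\in \mathbb {R}^{(q-1)\times (q-1)}\), so that, since \(\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})_q=0\), \( \mathbf{{0}}= \left( \mathbf {I}- {\widehat{{\varvec{\beta }}}} {\widehat{{\varvec{\beta }}}^{\textsc {t}}}\right) (\varvec{\ell }_n+ \mathbf {M}_n \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}) ) \) implies
\[ \mathbf{{0}}= \varvec{\ell }_n^{(q-1)} + \mathbf {M}_{n,1} \mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}^{(q-1)}) . \]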
Therefore, from (A.4) we get that \(\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}}^{(q-1)}) = - \mathbf {M}_{n,1}^{-1} \varvec{\ell }_n^{(q-1)} \).
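As a small numerical illustration of this last formula, the sketch below solves the reduced \((q-1)\times (q-1)\) system and then checks that the full projected equation holds. The values of `ell_n` and `M_n` are made-up placeholders (the actual quantities are those defined in (20) and (21)), \(q=3\) is assumed, and `solve_linear` is a hypothetical helper written only for this example.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


# Placeholder values, NOT taken from the paper (q = 3, beta_hat = e_q).
ell_n = [0.4, -0.3, 1.1]
M_n = [[2.0, 0.5, 0.3],
       [0.1, 1.5, -0.7],
       [0.9, 0.2, 3.0]]
q = len(ell_n)

# Upper-left (q-1) x (q-1) block of M_n and first q-1 entries of ell_n.
M_n1 = [row[:q - 1] for row in M_n[:q - 1]]
reduced = solve_linear(M_n1, [-v for v in ell_n[:q - 1]])

# The q-th coordinate of the EIF is zero because beta_hat = e_q.
eif_beta = reduced + [0.0]

# With beta_hat = e_q, (I - beta beta^T) simply zeroes the last coordinate.
full = [ell_n[i] + sum(M_n[i][j] * eif_beta[j] for j in range(q)) for i in range(q)]
projected = full[:q - 1] + [0.0]

print("EIF of beta_hat:", eif_beta)
print("projected residual:", projected)
```

The projected residual is zero up to rounding, which is the statement that the first \(q-1\) coordinates of \(\mathop {\mathrm{EIF}}(\widehat{{\varvec{\beta }}})\) solve the reduced system.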
It is worth noticing that \(\varvec{\ell }_n\) and \(\mathbf {M}_n\) involve \(\left. \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u))\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\),\( \left. \mathop {\mathrm{EIF}}( {\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial {\varvec{{\varvec{\beta }}}}})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\) and \( \left. \mathop {\mathrm{EIF}}( {\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial u})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\).
b) Let us derive \( \mathop {\mathrm{EIF}}(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)) \). Since \(\widehat{\eta }^{\varepsilon }_{{\varvec{{\varvec{\beta }}}}}(u)\) is the solution of (A.1), we have that
Differentiating with respect to \(\varepsilon \) and evaluating at \(\varepsilon =0\), we obtain that
Analogously, differentiating first with respect to \({\varvec{\beta }}\) on both sides of Eq. (A.1) and then, with respect to \(\varepsilon \) and evaluating at \(\varepsilon =0\), we can obtain an expression for \( \left. \mathop {\mathrm{EIF}}({\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial {\varvec{\beta }}})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\). Alternatively, we may differentiate (A.5) with respect to \({\varvec{\beta }}\) to obtain
Similar arguments lead to the expression for \( \left. \mathop {\mathrm{EIF}}({\partial \widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)}/{\partial u})\right| _{({\varvec{{\varvec{\beta }}}},u)=\widehat{\mathbf {s}}_i}\).
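The pattern used in b), differentiating an estimating equation with respect to \(\varepsilon \) and evaluating at \(\varepsilon =0\), can be illustrated on a much simpler estimator. The sketch below is a stand-in and not the paper's kernel-weighted estimator of \(\eta \): it assumes a location \(M\)-estimator with a Huber score (tuning constant 1.345), a contaminated empirical measure that puts mass \(1-\varepsilon \) on the sample and \(\varepsilon \) on a new point \(y_0\), and made-up data. All function names are hypothetical.

```python
def huber_psi(r, c=1.345):
    """Huber score: identity on [-c, c], clipped outside."""
    return max(-c, min(c, r))


def huber_psi_prime(r, c=1.345):
    """Derivative of the Huber score: 1 inside the clipping region, 0 outside."""
    return 1.0 if abs(r) < c else 0.0


def location_estimate(sample, y0=0.0, eps=0.0, lo=-50.0, hi=50.0):
    """Bisection root in t of (1 - eps) * mean psi(y_i - t) + eps * psi(y0 - t)."""
    def score(t):
        base = sum(huber_psi(y - t) for y in sample) / len(sample)
        return (1.0 - eps) * base + eps * huber_psi(y0 - t)

    # The score is nonincreasing in t, positive at lo and negative at hi.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eif_closed_form(sample, y0):
    """Implicit differentiation of the estimating equation at eps = 0."""
    t_hat = location_estimate(sample)
    slope = sum(huber_psi_prime(y - t_hat) for y in sample) / len(sample)
    return huber_psi(y0 - t_hat) / slope


def eif_finite_difference(sample, y0, eps=1e-6):
    """Difference quotient of the contaminated estimate around eps = 0."""
    return (location_estimate(sample, y0, eps) - location_estimate(sample)) / eps


sample = [-1.2, -0.4, 0.1, 0.3, 0.8, 1.5, 2.9]  # made-up data
closed = eif_closed_form(sample, 10.0)
numeric = eif_finite_difference(sample, 10.0)

print("closed-form EIF at y0 = 10:", closed)
print("finite-difference EIF at y0 = 10:", numeric)
```

Because the Huber score is bounded, moving the contaminating point further away does not increase the closed-form value, which is the kind of bounded sensitivity the empirical influence function is meant to reveal.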
Finally, note that \(\widehat{\eta }_{{\varvec{{\varvec{\beta }}}}}(u)\) satisfies
Hence, differentiating equation (A.6) with respect to \({\varvec{\beta }}\), we get that
which implies that
On the other hand, differentiating (A.6) with respect to u, we obtain that
which entails that
About this article
Cite this article
Agostinelli, C., Bianco, A.M. & Boente, G. Robust estimation in single-index models when the errors have a unimodal density with unknown nuisance parameter. Ann Inst Stat Math 72, 855–893 (2020). https://doi.org/10.1007/s10463-019-00712-8
DOI: https://doi.org/10.1007/s10463-019-00712-8