Abstract
In nonparametric regression, a large number of covariates brings the curse of dimensionality. When the sample is not large enough, one way to deal with this problem is to use a reduced number of linear combinations of the explanatory variables that contain most of the information about the response variable. This leads to the so-called sufficient dimension reduction problem. The purpose of this paper is to obtain robust estimators of a sufficient dimension reduction, that is, estimators that are not much affected by the presence of a small fraction of outliers in the data. One way to derive a sufficient dimension reduction is by means of the principal fitted components (PFC) model. We obtain robust estimators of the parameters of this model, and of the corresponding sufficient dimension reduction, based on a \(\tau \)-scale (\(\tau \)-estimators). Strong consistency of these estimators is proven under weak assumptions on the underlying distribution. The \(\tau \)-estimators for the PFC model are computed using an iterative algorithm. A Monte Carlo study compares the performance of \(\tau \)-estimators and maximum likelihood estimators. The results show clear advantages for \(\tau \)-estimators in the presence of outlier contamination and only a small loss of efficiency when outliers are absent. A cross-validation proposal for selecting the dimension of the reduction space is given. These estimators are implemented in the R language through functions contained in the package tauPFC. As the PFC model is a special case of multivariate reduced-rank regression, our proposal can be applied directly to that model as well.
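The abstract does not spell out what a \(\tau \)-scale is, so the following is only a minimal univariate sketch of the usual definition: an M-scale s solving mean(rho1(r/s)) = b, multiplied by the square root of mean(rho2(r/s)) for a second, more efficient rho function. It is not the multivariate PFC estimator of the paper and not the tauPFC API. The function names, the bisquare rho family, and the tuning constants (c1 = 1.5476, c2 = 6.08, b = 0.5) are assumptions chosen for illustration, and no consistency constant is applied.

```python
import math
import statistics


def rho_bisquare(u, c):
    """Tukey bisquare rho, scaled so that rho(u) = 1 whenever |u| >= c."""
    if abs(u) >= c:
        return 1.0
    t = (u / c) ** 2
    return 1.0 - (1.0 - t) ** 3


def m_scale(r, c=1.5476, b=0.5, tol=1e-10, max_iter=200):
    """M-scale: the s > 0 solving mean(rho(r_i / s)) = b, by fixed-point iteration."""
    # Start from the normalized median of absolute values.
    s = statistics.median([abs(x) for x in r]) / 0.6745
    if s == 0.0:
        return 0.0  # degenerate sample: at least half of the values are zero
    for _ in range(max_iter):
        mean_rho = sum(rho_bisquare(x / s, c) for x in r) / len(r)
        s_new = s * math.sqrt(mean_rho / b)
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s


def tau_scale(r, c1=1.5476, c2=6.08, b=0.5):
    """Univariate tau-scale: M-scale times sqrt of the mean of a second rho."""
    s = m_scale(r, c1, b)
    if s == 0.0:
        return 0.0
    mean_rho2 = sum(rho_bisquare(x / s, c2) for x in r) / len(r)
    return s * math.sqrt(mean_rho2)


# Hypothetical data: a small clean sample and the same sample with one gross outlier.
clean = [-1.2, -0.7, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 1.3]
contaminated = clean[:-1] + [1000.0]

print("tau-scale, clean:       ", tau_scale(clean))
print("tau-scale, contaminated:", tau_scale(contaminated))
print("std. dev., contaminated:", statistics.stdev(contaminated))
```

Because both rho functions are bounded, a single gross outlier can contribute at most 1/n to each average, so the \(\tau \)-scale of the contaminated sample stays of the same order as that of the clean one, whereas the sample standard deviation is driven almost entirely by the outlier. This is the kind of behaviour the abstract refers to when it describes estimators that are not much affected by a small fraction of outliers.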
Acknowledgements
We thank the Associate Editor and the referees for their comments and suggestions, which helped us to improve this paper. We also gratefully acknowledge partial support from Grants PICT-2016-0377 and PICT-2015-2023 from ANPCYT, Grant 20020170100330BA from Universidad de Buenos Aires, and Grant 50420150100032LI from Universidad Nacional del Litoral, Argentina.
About this article
Cite this article
Bergesio, A., Szretter Noste, M.E. & Yohai, V.J. A robust proposal of estimation for the sufficient dimension reduction problem. TEST 30, 758–783 (2021). https://doi.org/10.1007/s11749-020-00745-9
DOI: https://doi.org/10.1007/s11749-020-00745-9