Robust regression using biased objectives
In non-parametric regression, designing the objective function to be minimized by the learner is a critical task. In this paper we propose a principled method for constructing and minimizing robust losses, which are resilient to errant observations even under small samples. Existing proposals typically utilize very strong estimates of the true risk, but in doing so require a priori information that is not available in practice. By abandoning direct approximation of the risk, we gain substantial stability at a tolerable price in terms of bias, all while circumventing the computational issues of existing procedures. We analyze existence and convergence conditions, provide practical computational routines, and show empirically that the proposed method realizes superior robustness over wide data classes without assuming any prior knowledge.
Keywords: Robust loss, Heavy-tailed noise, Risk minimization