
Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator

  • Regular Article, published in Statistical Papers

An Erratum to this article was published on 25 May 2013

Abstract

The penalized maximum likelihood estimator (PMLE) has been widely used for variable selection in high-dimensional data. Various penalty functions have been employed for this purpose, e.g., the Lasso, the weighted Lasso, or smoothly clipped absolute deviations. However, the PMLE can be very sensitive to outliers in the data, especially to outliers in the covariates (leverage points). To overcome this disadvantage, the penalized maximum trimmed likelihood estimator (PMTLE) is proposed for estimating the unknown parameters in a robust way. The computation of the PMTLE takes advantage of the same technology as used for the PMLE, but the estimation is based on subsamples only. The breakdown point properties of the PMTLE are discussed using the notion of \(d\)-fullness. The performance of the proposed estimator is evaluated in a simulation study for the classical multiple linear and Poisson linear regression models.
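The trimmed-likelihood idea summarized above — run the penalized-likelihood machinery on the best-fitting subsample of size \(h\) rather than on all \(n\) observations — can be illustrated with a minimal sketch for the Gaussian linear model with a Lasso penalty. This is not the authors' implementation: the function names, the plain coordinate-descent Lasso, and the concentration-step strategy with random restarts are illustrative assumptions (the concentration step mirrors the sparse least trimmed squares approach). In the Gaussian case, maximizing the trimmed likelihood is equivalent to minimizing the sum of the \(h\) smallest squared residuals.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent Lasso for (1/(2n))||y - X b||^2 + lam ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)          # per-column squared norms
    r = y.copy()                           # residual for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]         # add back j-th contribution (partial residual)
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - n * lam, 0.0) / col_sq[j]
            r -= X[:, j] * beta[j]         # update residual with new beta_j
    return beta

def pmtle_gaussian(X, y, h, lam, n_starts=5, n_csteps=20, seed=0):
    """Illustrative PMTLE sketch, Gaussian case: alternate a penalized fit with
    keeping the h best-fitting observations (concentration steps), restarted
    from several random subsamples; return the fit with the smallest trimmed
    sum of squared residuals."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    best = (np.inf, None, None)
    for _ in range(n_starts):
        subset = rng.choice(n, h, replace=False)
        for _ in range(n_csteps):
            beta = lasso_cd(X[subset], y[subset], lam)
            resid2 = (y - X @ beta) ** 2   # Gaussian likelihood <-> squared residuals
            new = np.argsort(resid2)[:h]   # h observations with highest likelihood
            if np.array_equal(np.sort(new), np.sort(subset)):
                break                      # subset stable: concentration converged
            subset = new
        obj = np.sort(resid2)[:h].sum()    # trimmed objective of this start
        if obj < best[0]:
            best = (obj, beta, np.sort(subset))
    return best[1], best[2]
```

On data with vertical outliers, the trimmed fit typically excludes the contaminated observations from the final subsample, so the Lasso coefficients are estimated from clean cases only; an ordinary PMLE on the full data would be pulled toward the outliers.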



Acknowledgments

The authors thank the Vienna University of Technology and the ESF (COST Action IC0702) for supporting the stay of N. Neykov and P. Neytchev in Vienna.

Author information

Correspondence to N. M. Neykov.


Cite this article

Neykov, N.M., Filzmoser, P. & Neytchev, P.N. Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator. Stat Papers 55, 187–207 (2014). https://doi.org/10.1007/s00362-013-0516-z
