Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection

Peng, Yang; Luo, Bin; Gao, Xiaoli

doi:10.1007/s13571-022-00279-0

Published: 11 May 2022

Volume 84, pages 694–707, (2022)

Sankhya B

Abstract

Outlier detection has become an important and challenging issue in high-dimensional data analysis due to the coexistence of data contamination and high dimensionality. Most widely used penalized least squares methods are sensitive to outliers because of the ℓ₂ loss. In this paper, we propose a Robust Moderately Clipped LASSO (RMCL) estimator that performs simultaneous outlier detection, variable selection and robust estimation. The RMCL estimator can be computed efficiently using the coordinate descent algorithm within a convex-concave procedure. Our numerical studies demonstrate that the RMCL estimator is superior in both variable selection and outlier detection and can thus be advantageous in difficult prediction problems with data contamination.
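
As a rough illustration of how coordinate descent can handle outlier detection and variable selection at once, the sketch below fits a mean-shift model y = Xβ + γ + ε with plain ℓ₁ penalties on both β and γ. This is not the RMCL estimator: the abstract does not state the objective, so the mean-shift formulation, the lasso penalty (swapped in for the moderately clipped penalty), the function names and the tuning values are all assumptions made for illustration. A convex-concave procedure would wrap a convex solver of this kind in an outer loop that re-linearizes the concave part of the penalty.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator, the closed-form lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def mean_shift_lasso(X, y, lam_beta, lam_gamma, n_iter=200):
    """Coordinate descent for the (assumed, illustrative) objective

        0.5 * ||y - X beta - gamma||^2
            + lam_beta * ||beta||_1 + lam_gamma * ||gamma||_1

    Nonzero entries of gamma flag candidate outliers; nonzero entries
    of beta are the selected variables.
    """
    n, p = X.shape
    beta = np.zeros(p)
    gamma = np.zeros(n)
    col_sq = (X ** 2).sum(axis=0)      # squared column norms
    r = y - X @ beta - gamma           # current residual
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * beta[j]     # remove coordinate j from the fit
            beta[j] = soft_threshold(X[:, j] @ r, lam_beta) / col_sq[j]
            r -= X[:, j] * beta[j]     # put the updated coordinate back
        r += gamma                     # gamma is separable given beta
        gamma = soft_threshold(r, lam_gamma)
        r -= gamma
    return beta, gamma


# Hypothetical toy data: two active predictors, three shifted responses.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((60, 10))
beta_true = np.zeros(10)
beta_true[:2] = [3.0, -2.0]
y_demo = X_demo @ beta_true + 0.1 * rng.standard_normal(60)
y_demo[:3] += 10.0                     # contaminate the first three rows

beta_hat, gamma_hat = mean_shift_lasso(X_demo, y_demo, lam_beta=2.0, lam_gamma=1.0)
print("selected variables:", np.flatnonzero(beta_hat))
print("flagged observations:", np.flatnonzero(gamma_hat))
```

In this toy setup the penalty on γ acts as a threshold on the residuals, so lam_gamma controls how large a residual must be before an observation is flagged.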

Funding

Xiaoli Gao is partially supported by Simons Foundation (SF359337) and Simons Foundation (SF854940).

Author information

Authors and Affiliations

Department of Mathematics and Statistics, University of North Carolina at Greensboro, Greensboro, NC, USA
Yang Peng & Xiaoli Gao
Department of Biostatistics and Bioinformatics, Duke University, Durham, USA
Bin Luo

Corresponding author

Correspondence to Xiaoli Gao.

Ethics declarations

Conflict of Interests

The corresponding author states that there is no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Peng, Y., Luo, B. & Gao, X. Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection. Sankhya B 84, 694–707 (2022). https://doi.org/10.1007/s13571-022-00279-0

Received: 02 August 2021
Accepted: 17 March 2022
Published: 11 May 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s13571-022-00279-0

Keywords

PACS Nos

Primary 62J07; Secondary 62F35
