A high-dimensional M-estimator framework for bi-level variable selection

Annals of the Institute of Statistical Mathematics

Abstract

In high-dimensional data analysis, bi-level sparsity is often assumed when covariates act in groups and sparsity can appear either at the group level or within certain groups. In such cases, an ideal model should perform bi-level variable selection consistently. Bi-level variable selection becomes even more challenging when the data are heavy-tailed or when outliers are present in the random errors and covariates. In this paper, we study a framework of high-dimensional M-estimation for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure. Theoretically, we provide sufficient conditions under which our two-stage penalized M-estimator simultaneously achieves local estimation consistency and bi-level variable selection consistency when certain non-convex penalty functions are used at the group level. Both our simulation studies and real data analysis demonstrate satisfactory finite-sample performance of the proposed estimators under different irregular settings.
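To make the idea of a two-stage penalized M-estimator concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Huber loss as the robust M-estimation loss, swaps in a convex group lasso penalty for the non-convex group-level penalty mentioned in the abstract, and uses an ordinary lasso penalty within the selected groups. All function names, the tuning parameters `lam_group` and `lam_within`, and the Huber constant 1.345 are illustrative assumptions.

```python
import numpy as np

def huber_grad(r, delta=1.345):
    """Derivative of the Huber loss with respect to the residual r."""
    return np.clip(r, -delta, delta)

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2 (shrinks a whole group toward zero)."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def prox_gradient(X, y, prox, n_iter=500):
    """Proximal gradient descent on the averaged Huber loss plus a penalty."""
    n, p = X.shape
    # The Huber score is 1-Lipschitz, so ||X||_2^2 / n bounds the gradient's
    # Lipschitz constant; its inverse is a safe step size.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ huber_grad(y - X @ beta) / n
        beta = prox(beta - step * grad, step)
    return beta

def two_stage(X, y, groups, lam_group, lam_within):
    """Stage 1 selects groups; stage 2 selects variables inside kept groups."""
    p = X.shape[1]

    def group_prox(b, step):
        out = b.copy()
        for g in groups:
            out[g] = group_soft_threshold(b[g], step * lam_group * np.sqrt(len(g)))
        return out

    # Stage 1: group-level selection (group lasso as a convex stand-in).
    beta1 = prox_gradient(X, y, group_prox)
    active = [j for g in groups if np.any(beta1[g] != 0) for j in g]

    # Stage 2: within-group selection on the surviving columns only.
    beta = np.zeros(p)
    if active:
        beta[active] = prox_gradient(
            X[:, active], y, lambda b, step: soft_threshold(b, step * lam_within)
        )
    return beta

# Illustrative data: three groups of three covariates, heavy-tailed errors.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 9))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0])
y = X @ beta_true + rng.standard_t(df=2, size=200)
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(two_stage(X, y, groups, lam_group=0.1, lam_within=0.05))
```

Splitting the problem this way means the second stage only has to search within groups that survived the first stage, which is one plausible reading of why a two-stage procedure can be computationally efficient. The penalty levels used here are arbitrary and would need to be tuned in practice.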

Author information

Corresponding author

Correspondence to Bin Luo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Gao is partially supported by Simons Foundation Grant: SF359337.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF 275 kb)

About this article

Cite this article

Luo, B., Gao, X. A high-dimensional M-estimator framework for bi-level variable selection. Ann Inst Stat Math 74, 559–579 (2022). https://doi.org/10.1007/s10463-021-00809-z

  • DOI: https://doi.org/10.1007/s10463-021-00809-z
