
Sparse estimation in high-dimensional linear errors-in-variables regression via a covariate relaxation method

  • Original Paper, published in Statistics and Computing

Abstract

Sparse signal recovery in high-dimensional settings via regularization techniques has been developed over the past two decades and has produced fruitful results in various areas. Previous studies mainly focus on the idealized assumption that covariates are free of noise. In realistic scenarios, however, covariates are often corrupted by measurement errors, which may induce significant estimation bias when methods designed for clean data are naively applied. Recent studies have begun to address errors-in-variables models, but current methods either depend on the distribution of the covariate noise or are free of that dependence yet inconsistent in parameter estimation. A novel covariate relaxation method that does not depend on the distribution of the covariate noise is proposed, and statistical consistency of the resulting parameter estimates is established. Numerical experiments show that the covariate relaxation method achieves estimation accuracy equal to or better than that of the state-of-the-art nonconvex Lasso estimator. Its independence of the covariate noise distribution, combined with its small estimation error, suggests its promise in practical applications.
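The estimation bias mentioned in the abstract can be seen in a minimal simulation (this is not the authors' covariate relaxation method; all names and parameter values below are illustrative). Covariates are observed with additive noise, and a clean-data least-squares fit applied to the corrupted covariates attenuates the coefficients toward zero:

```python
# Minimal sketch of measurement-error (errors-in-variables) bias.
# Assumptions: additive covariate noise, low-dimensional OLS as the
# "clean-data" estimator; in the paper's setting p >> n with sparsity.
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 5
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]            # sparse true signal

X = rng.standard_normal((n, p))        # clean (unobserved) covariates
y = X @ beta_true + 0.1 * rng.standard_normal(n)

W = 0.7 * rng.standard_normal((n, p))  # additive measurement error
Z = X + W                              # observed, corrupted covariates

def ols(A, b):
    """Least-squares fit, standing in for any clean-data estimator."""
    return np.linalg.lstsq(A, b, rcond=None)[0]

beta_clean = ols(X, y)  # consistent: recovers beta_true
beta_naive = ols(Z, y)  # biased: coefficients shrink toward zero

print("clean fit :", np.round(beta_clean[:2], 2))
print("naive fit :", np.round(beta_naive[:2], 2))
```

With unit-variance covariates and noise variance 0.49, the naive fit is attenuated by roughly a factor 1/(1 + 0.49), which is the bias that errors-in-variables estimators are designed to remove.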



Acknowledgements

Xin Li’s work was supported in part by the National Natural Science Foundation of China (12201496) and the Natural Science Foundation of Shaanxi Province of China (2022JQ-045). Dongya Wu’s work was supported in part by the National Natural Science Foundation of China (62103329).

Author information

Authors and Affiliations

Authors

Contributions

XL and DW wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Dongya Wu.

Ethics declarations

Conflict of interest

The authors have no competing interests that are directly or indirectly related to the work submitted for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, X., Wu, D. Sparse estimation in high-dimensional linear errors-in-variables regression via a covariate relaxation method. Stat Comput 34, 4 (2024). https://doi.org/10.1007/s11222-023-10312-5

