Abstract
Sparse signal recovery in high-dimensional settings via regularization techniques has been developed over the past two decades and has produced fruitful results in various areas. Previous studies mainly focus on the idealized assumption that covariates are free of noise. In realistic scenarios, however, covariates are often corrupted by measurement errors, which can induce significant estimation bias when methods designed for clean data are naively applied. Recent studies have begun to address such errors-in-variables models. Existing methods either depend on the distribution of the covariate noise or are free of this dependence but inconsistent in parameter estimation. A novel covariate relaxation method that does not depend on the distribution of the covariate noise is proposed, and statistical consistency of the parameter estimates is established. Numerical experiments show that the covariate relaxation method achieves estimation accuracy equal to or better than that of the state-of-the-art nonconvex Lasso estimator. Its independence of the covariate noise distribution, combined with a small estimation error, suggests its promise in practical applications.
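The estimation bias caused by naively applying a clean-data method to error-corrupted covariates can be seen in a small simulation. The sketch below is our own illustration, not the paper's covariate relaxation estimator: it fits an \(\ell_1\)-regularized least-squares problem via ISTA once on clean covariates and once on noisy ones; the noisy fit suffers attenuation bias. All dimensions and noise levels are arbitrary choices for the demonstration.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t * ||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=1000):
    # Minimize (1/2n) ||y - A b||^2 + lam * ||b||_1 via ISTA
    n, p = A.shape
    step = n / np.linalg.norm(A, 2) ** 2  # 1/L with L = ||A||_2^2 / n
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = A.T @ (A @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b

rng = np.random.default_rng(0)
n, p, s = 200, 50, 5
beta = np.zeros(p)
beta[:s] = 1.0                              # sparse ground truth
X = rng.standard_normal((n, p))             # clean covariates
y = X @ beta + 0.1 * rng.standard_normal(n)
Z = X + 0.5 * rng.standard_normal((n, p))   # observed, error-corrupted covariates

err_clean = np.linalg.norm(lasso_ista(X, y, lam=0.05) - beta)
err_noisy = np.linalg.norm(lasso_ista(Z, y, lam=0.05) - beta)
print(err_clean, err_noisy)  # the noisy fit has a markedly larger error
```

The gap between the two errors is exactly the phenomenon the errors-in-variables literature corrects for: regressing on `Z` instead of `X` systematically attenuates the estimated coefficients.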
Acknowledgements
Xin Li’s work was supported in part by the National Natural Science Foundation of China (12201496) and the Natural Science Foundation of Shaanxi Province of China (2022JQ-045). Dongya Wu’s work was supported in part by the National Natural Science Foundation of China (62103329).
Author information
Contributions
XL and DW wrote the main manuscript text. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests that are directly or indirectly related to the work submitted for publication.
About this article
Cite this article
Li, X., Wu, D. Sparse estimation in high-dimensional linear errors-in-variables regression via a covariate relaxation method. Stat Comput 34, 4 (2024). https://doi.org/10.1007/s11222-023-10312-5