
Robust Error Density Estimation in Ultrahigh Dimensional Sparse Linear Model


Abstract

This paper focuses on error density estimation in the ultrahigh dimensional sparse linear model, where the error term may have a heavy-tailed distribution. First, an improved two-stage refitted cross-validation method, combined with robust variable screening procedures such as RRCS and variable selection methods such as LAD-SCAD, is used to obtain a submodel; the residual-based kernel density method is then applied to estimate the error density through LAD regression. Under the given conditions, the large-sample properties of the estimator are established. In particular, we explicitly give the relationship between the sparsity and the convergence rate of the kernel density estimator. Simulation results show that the proposed error density estimator performs well, and a real data example is presented to illustrate our methods.
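To make the data flow of the two-stage procedure concrete, the following is a minimal Python sketch, not the authors' implementation. It makes several simplifying assumptions: a Kendall's-tau ranking stands in for the RRCS screening step, an unpenalized LAD (median quantile regression) fit on the screened covariates stands in for LAD-SCAD selection, and the kernel bandwidth is left at SciPy's default rule rather than being chosen in a data-driven way. All function and variable names below are illustrative and not taken from the paper.

```python
# Hedged sketch of a refitted cross-validation (RCV) pipeline for error density
# estimation with a heavy-tailed error. Screening and selection steps are
# simplified stand-ins for RRCS and LAD-SCAD (see lead-in above).
import numpy as np
from scipy.stats import kendalltau, gaussian_kde
import statsmodels.api as sm


def screen_rank_corr(X, y, d):
    """Keep the d covariates with the largest |Kendall tau| with y (RRCS-style proxy)."""
    taus = np.array([abs(kendalltau(X[:, j], y)[0]) for j in range(X.shape[1])])
    return np.argsort(taus)[::-1][:d]


def lad_fit(X, y):
    """LAD regression, i.e. quantile regression at the median."""
    return sm.QuantReg(y, sm.add_constant(X)).fit(q=0.5).params


def rcv_error_density(X, y, d, rng):
    """Screen on one half of the data, refit by LAD on the other half,
    pool the refitted residuals from both splits, and return a kernel
    density estimate of the error distribution."""
    n = len(y)
    idx = rng.permutation(n)
    halves = (idx[: n // 2], idx[n // 2:])
    residuals = []
    for fit_half, screen_half in [(0, 1), (1, 0)]:
        sel = screen_rank_corr(X[halves[screen_half]], y[halves[screen_half]], d)
        beta = lad_fit(X[halves[fit_half]][:, sel], y[halves[fit_half]])
        Xf = sm.add_constant(X[halves[fit_half]][:, sel])
        residuals.append(y[halves[fit_half]] - Xf @ beta)
    return gaussian_kde(np.concatenate(residuals))  # default bandwidth (Scott's rule)


# Toy example: p >> n with a sparse signal and a heavy-tailed (t_3) error.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_t(df=3, size=n)
f_hat = rcv_error_density(X, y, d=20, rng=rng)
print(f_hat(np.linspace(-4, 4, 9)))  # estimated error density on a grid
```

In the paper itself, the submodel is obtained by LAD-SCAD after robust screening, and the bandwidth enters the stated convergence rates; the sketch above is intended only to illustrate how screening, refitting, and residual-based kernel density estimation fit together.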


References

  1. Anscombe, F. J., Glynn, W. J.: Distribution of the kurtosis statistic b2 for normal samples. Biometrika, 70, 227–234 (1983)

  2. Arslan, O.: Weighted LAD-lasso method for robust parameter estimation and variable selection in regression. Computational Statistics & Data Analysis, 56, 1952–1965 (2012)

  3. Bassett, G., Koenker, R.: Asymptotic theory of least absolute error regression. Journal of the American Statistical Association, 73, 618–622 (1978)

  4. Bowman, A. W.: An alternative method of cross-validation for the smoothing of density estimates. Biometrika, 71, 353–360 (1984)

  5. Chai, G., Li, Z.: Asymptotic theory for estimation of error distribution in linear model. Science in China Series A, 36, 408–419 (1993)

  6. Chen, X., Bai, Z., Zhao, L., et al.: Asymptotic normality of minimum l1-norm estimates in linear model. Science in China Series A, 33, 1311–1328 (1990)

  7. Cheng, F.: Weak and strong uniform consistency of a kernel error density estimator in nonparametric regression. Journal of Statistical Planning and Inference, 119, 95–107 (2004)

  8. Cheng, F.: Asymptotic distributions of error density and distribution function estimators in nonparametric regression. Journal of Statistical Planning and Inference, 128, 327–349 (2005)

  9. D'Agostino, R. B.: Transformation to normality of the null distribution of g1. Biometrika, 57, 679–681 (1970)

  10. Fan, J., Guo, S., Hao, N.: Variance estimation using refitted cross-validation in ultrahigh dimensional regression. Journal of the Royal Statistical Society Series B, 74, 37–65 (2012)

  11. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360 (2001)

  12. Fan, J., Lv, J.: Sure independence screening for ultra-high dimensional feature space. Journal of the Royal Statistical Society Series B, 70, 849–911 (2008)

  13. Fu, W. J., Knight, K.: Asymptotics for lasso-type estimators. Annals of Statistics, 28, 1356–1378 (2000)

  14. Gao, X., Huang, J.: Asymptotic analysis of high-dimensional LAD regression with lasso smoother. Statistica Sinica, 20, 187–193 (2010)

  15. Hall, P.: Laws of the iterated logarithm for nonparametric density estimators. Probability Theory & Related Fields, 56, 47–61 (1981)

  16. Hall, P.: Large sample optimality of least squares cross-validation in density estimation. Annals of Statistics, 11, 1156–1174 (1983)

  17. Huang, J., Horowitz, J. L., Ma, S.: Asymptotic properties of bridge estimators in sparse high-dimensional regression models. Annals of Statistics, 36, 587–613 (2008)

  18. Jarque, C. M., Bera, A. K.: A test for normality of observations and regression residuals. International Statistical Review, 55, 163–172 (1987)

  19. Koenker, R., Bassett, G. W.: Regression quantiles. Econometrica, 46, 211–244 (1978)

  20. Li, G., Peng, H., Zhang, J., et al.: Robust rank correlation based screening. Annals of Statistics, 40, 1846–1877 (2012)

  21. Li, G., Peng, H., Zhu, L.: Nonconcave penalized M-estimation with a diverging number of parameters. Statistica Sinica, 21, 391–419 (2011)

  22. Li, R., Zhong, W., Zhu, L.: Feature screening via distance correlation learning. Journal of the American Statistical Association, 107, 1129–1139 (2012)

  23. Li, Z.: A study of nonparametric estimation of error distribution in linear model based on l1-norm. Hiroshima Mathematical Journal, 25, 171–205 (1995)

  24. Liang, H., Härdle, W.: Large sample theory of the estimation of the error distribution for a semiparametric model. Communications in Statistics, 28, 2025–2036 (1999)

  25. Maye, J., Gerken, L.: Learning phonemes without minimal pairs. In: Proceedings of the 24th Annual Boston University Conference on Language Development, Vol. 2, 522–533 (2000)

  26. McKean, J., Schrader, R.: Least absolute errors analysis of variance. In: Statistical Data Analysis Based on the L1-norm and Related Methods, North-Holland, Amsterdam, 1987

  27. Meinshausen, N.: Relaxed lasso. Computational Statistics & Data Analysis, 52, 374–393 (2007)

  28. Meinshausen, N., Meier, L., Bühlmann, P.: P-values for high-dimensional regression. Journal of the American Statistical Association, 104, 1671–1681 (2009)

  29. Pollard, D.: Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7, 186–199 (1991)

  30. Portnoy, S., Koenker, R.: The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators. Statistical Science, 12, 279–300 (1997)

  31. Powell, J. L.: Least absolute deviations estimation for the censored regression model. Journal of Econometrics, 25, 303–325 (1984)

  32. Powell, J. L.: Censored regression quantiles. Journal of Econometrics, 32, 143–155 (1986)

  33. Pourahmadi, M.: High-Dimensional Covariance Estimation, Wiley, New Jersey, 2013

  34. Rudemo, M.: Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics, 9, 65–78 (1982)

  35. Silverman, B. W.: Density Estimation for Statistics and Data Analysis, Chapman and Hall, London, 1986

  36. Städler, N., Bühlmann, P., van de Geer, S.: l1-penalization for mixture regression models. Test, 19, 209–256 (2010)

  37. Tibshirani, R.: Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B, 58, 267–288 (1996)

  38. Wahid, A., Khan, D. M., Hussain, I.: Robust adaptive lasso method for parameter's estimation and variable selection in high-dimensional sparse models. PLoS ONE, 12, 1–17 (2017)

  39. Wang, H., Li, G., Jiang, G.: Robust regression shrinkage and consistent variable selection through the LAD-lasso. Journal of Business & Economic Statistics, 25, 347–355 (2007)

  40. Wang, L.: The l1 penalized LAD estimator for high dimensional linear regression. Journal of Multivariate Analysis, 120, 135–151 (2013)

  41. Wang, L., Wu, Y., Li, R.: Quantile regression for analyzing heterogeneity in ultra-high dimension. Journal of the American Statistical Association, 107, 214–222 (2012)

  42. Wu, Y.: Strong consistency and exponential rate of the "minimum l1-norm" estimates in linear regression models. Computational Statistics & Data Analysis, 6, 285–295 (1988)

  43. Zhang, C.: Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 38, 894–942 (2010)

  44. Zhong, W., Zhu, L., Li, R., et al.: Regularized quantile regression and robust feature screening for single index models. Statistica Sinica, 26, 69–95 (2016)

  45. Zou, F., Cui, H.: Error density estimation in high-dimensional sparse linear model. Annals of the Institute of Statistical Mathematics, 72, 427–449 (2020)

  46. Zou, F., Cui, H.: RCV-based error density estimation in the ultrahigh dimensional additive model. Science China Mathematics, 65 (2022), https://doi.org/10.1007/s11425-019-1722-2


Acknowledgements

The authors thank the Editor, the Associate Editor and the reviewers for their constructive comments, which have led to an improvement over the earlier version of this paper.

Author information


Corresponding author

Correspondence to Heng Jian Cui.

Additional information

Supported by the National Natural Science Foundation of China (Grant No. 11971324) and the State Key Program of National Natural Science Foundation of China (Grant No. 12031016)


About this article


Cite this article

Zou, F., Cui, H. J.: Robust Error Density Estimation in Ultrahigh Dimensional Sparse Linear Model. Acta Math. Sin.-English Ser., 38, 963–984 (2022). https://doi.org/10.1007/s10114-022-1134-2

