Abstract
In this paper, we study the estimation of the error density in the ultrahigh dimensional sparse additive model, where the number of variables exceeds the sample size. First, a smoothing method based on B-splines is applied to estimate the regression functions. Second, an improved two-stage refitted cross-validation (RCV) procedure based on a random splitting technique is used to obtain the residuals of the model, and a residual-based kernel method is then applied to estimate the error density function. Under suitable sparsity conditions, large sample properties of the estimator are established, including weak and strong consistency, asymptotic normality, and the law of the iterated logarithm. In particular, the relationship between the sparsity and the convergence rate of the kernel density estimator is characterized. The methodology is illustrated by simulations and a real data example, which suggest that the proposed method performs well.
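The two-stage procedure described above can be sketched in a few lines. This is a minimal illustration, not the paper's estimator: a sparse linear working model and marginal-correlation screening stand in for the B-spline additive fit and the paper's screening step, and the number of selected variables `select_k` is an illustrative choice.

```python
# Sketch of RCV-based error density estimation: split the sample, screen
# variables on one half, refit on the other half, pool the refitted
# residuals from both directions, then kernel-smooth them.
import numpy as np

rng = np.random.default_rng(0)
n, p, select_k = 200, 500, 5            # n < p: ultrahigh-dimensional regime
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]             # sparse truth: 3 active variables
X = rng.standard_normal((n, p))
eps = rng.standard_normal(n)            # true error density: N(0, 1)
y = X @ beta + eps

def screen_then_refit(Xa, ya, Xb, yb, k):
    """Screen variables on (Xa, ya); refit by OLS and return residuals on (Xb, yb)."""
    corr = np.abs(Xa.T @ (ya - ya.mean())) / len(ya)
    sel = np.argsort(corr)[-k:]                       # keep top-k marginal signals
    coef, *_ = np.linalg.lstsq(Xb[:, sel], yb, rcond=None)
    return yb - Xb[:, sel] @ coef                     # refitted residuals

half = n // 2
resid = np.concatenate([
    screen_then_refit(X[:half], y[:half], X[half:], y[half:], select_k),
    screen_then_refit(X[half:], y[half:], X[:half], y[:half], select_k),
])

def kde(u, res, h):
    """Gaussian kernel density estimate of the error density at points u."""
    z = (u[:, None] - res[None, :]) / h
    return np.exp(-0.5 * z**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

h = 1.06 * resid.std() * len(resid) ** (-1 / 5)       # Silverman's rule of thumb
grid = np.linspace(-3, 3, 7)
print(np.round(kde(grid, resid, h), 3))
```

Splitting the two stages across independent halves is what guards against the screening step's spurious correlations contaminating the residuals; fitting and screening on the same half would bias the residual distribution toward zero.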
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 11971324 and 11471223), the Foundation of Science and Technology Innovation Service Capacity Building, the Interdisciplinary Construction of Bioinformatics and Statistics, and the Academy for Multidisciplinary Studies, Capital Normal University. The authors thank the reviewers for their constructive comments, which have led to an improvement of an earlier version of this paper.
About this article
Cite this article
Zou, F., Cui, H. RCV-based error density estimation in the ultrahigh dimensional additive model. Sci. China Math. 65, 1003–1028 (2022). https://doi.org/10.1007/s11425-019-1722-2
Keywords
- ultrahigh dimensional additive model
- B-spline
- kernel density estimation
- refitted cross-validation method
- asymptotic property