Journal of Systems Science and Complexity, Volume 31, Issue 5, pp 1350–1361

Feature Screening for Nonparametric and Semiparametric Models with Ultrahigh-Dimensional Covariates

  • Junying Zhang
  • Riquan Zhang
  • Jiajia Zhang
Article

Abstract

This paper considers feature screening and variable selection for ultrahigh-dimensional covariates. The new feature screening procedure is based on conditional expectation, which is used to determine whether an explanatory variable contributes to a response variable, without requiring a specific parametric form of the underlying data model. The authors estimate the marginal conditional expectation with a kernel regression estimator. The proposed method is shown to have the sure screening property. The authors also propose an iterative kernel estimator algorithm to reduce the ultrahigh dimensionality to an appropriate scale. Simulation results and a real data analysis demonstrate that the proposed method works well and performs better than competing methods.
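
The abstract does not state the screening statistic itself, so the following is only a minimal sketch of marginal kernel screening under stated assumptions, not the authors' procedure. It assumes a Nadaraya-Watson estimator with a Gaussian kernel for each marginal conditional expectation E[Y | X_j], a rule-of-thumb bandwidth, and a utility equal to the sample mean of the squared fitted values of the centered response. The function names `nw_fit`, `marginal_utilities`, and `screen` are hypothetical, and the iterative algorithm mentioned in the abstract is not reproduced.

```python
import numpy as np


def nw_fit(x, y, bandwidth):
    """Nadaraya-Watson estimate of E[y | x], evaluated at the sample points."""
    # Pairwise scaled differences between sample points (n x n matrix).
    diffs = (x[:, None] - x[None, :]) / bandwidth
    # Gaussian kernel weights; the normalising constant cancels in the ratio.
    weights = np.exp(-0.5 * diffs ** 2)
    return weights @ y / weights.sum(axis=1)


def marginal_utilities(X, y, bandwidth=None):
    """Assumed utility: mean squared fitted marginal conditional mean of centered y."""
    n, p = X.shape
    y_centered = y - y.mean()
    # Rule-of-thumb bandwidth for standardised covariates (an assumption).
    h = bandwidth if bandwidth is not None else 1.06 * n ** (-0.2)
    utilities = np.zeros(p)
    for j in range(p):
        xj = X[:, j]
        sd = xj.std()
        if sd == 0:
            continue  # a constant covariate carries no information
        xj = (xj - xj.mean()) / sd
        utilities[j] = np.mean(nw_fit(xj, y_centered, h) ** 2)
    return utilities


def screen(X, y, keep):
    """Return the indices of the `keep` covariates with the largest utilities."""
    utilities = marginal_utilities(X, y)
    return np.argsort(utilities)[::-1][:keep]
```

For example, with simulated data in which the response depends on one covariate through a sine and on another through a cosine, `screen(X, y, keep=10)` would be expected to retain both, even though the cosine effect is symmetric and has essentially no linear correlation with the response. Ranking by a nonparametric fit rather than by marginal correlation is what allows such effects to be detected.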

Keywords

Conditional expectation, dimensionality reduction, nonparametric and semiparametric models, ultrahigh dimension, variable screening

Copyright information

© Institute of Systems Science, Academy of Mathematics and Systems Science, CAS and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Statistics, East China Normal University, Shanghai, China
  2. Department of Mathematics, Taiyuan University of Technology, Taiyuan, China
  3. Department of Mathematics, Shanxi Datong University, Datong, China
  4. Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, USA