Inverse Modeling: A Strategy to Cope with Non-linearity

  • Qian Lin
  • Yang Li
  • Jun S. Liu
Chapter
Part of the Springer Handbooks of Computational Statistics book series (SHCS)

Abstract

In the big data era, discovering and modeling potentially non-linear relationships between predictors and responses is among the toughest challenges in modern data analysis. Most forward regression modeling procedures are seriously compromised by the curse of dimensionality. In this chapter, we show that the inverse modeling idea, which originated with Sliced Inverse Regression (SIR), can help detect nonlinear relations effectively, and we survey a few recent advances, both algorithmic and theoretical, in which inverse modeling leads to unexpected benefits in nonlinear variable selection and nonparametric screening.
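To make the inverse modeling idea concrete, the following is a minimal sketch of the SIR algorithm of Li (1991) referred to in the abstract: instead of regressing y on x forward, one slices the observations by y and examines the covariance of the within-slice means of the standardized predictors. The function name `sir_directions` and the simulated single-index example are illustrative choices, not taken from the chapter.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Estimate effective dimension reduction directions via SIR (Li, 1991)."""
    n, p = X.shape
    # Standardize X: Z = (X - mean) Sigma^{-1/2}
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Slice the observations by the order statistics of y
    slices = np.array_split(np.argsort(y), n_slices)
    # Weighted covariance of the within-slice means of Z (the "inverse" step)
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of M, mapped back to the original predictor scale
    w, v = np.linalg.eigh(M)
    B = inv_sqrt @ v[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

# Simulated single-index model: y depends on X only through beta^T x
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
y = (X @ beta) ** 3 + 0.5 * rng.normal(size=2000)
b_hat = sir_directions(X, y).ravel()
print(abs(b_hat @ beta))  # alignment with the true direction
```

Because SIR only requires slicing y and averaging x within slices, it is link-free: the cubic link above could be replaced by any monotone nonlinearity without changing the code.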

Keywords

Correlation pursuit · Multiple index models · Nonparametric screening · Sliced inverse regression · Sufficient dimension reduction · Sub-Gaussian

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China
  2. Vatic Labs, New York City, USA
  3. Department of Statistics, Harvard University, Cambridge, USA
  4. Center for Statistical Science, Tsinghua University, Beijing, China