TEST, Volume 25, Issue 4, pp 607–626

Statistical inference for generalized additive models: simultaneous confidence corridors and variable selection

  • Shuzhuan Zheng
  • Rong Liu
  • Lijian Yang
  • Wolfgang K. Härdle
Original Paper

Abstract

Despite the widespread use of generalized additive models (GAMs) to remedy the “curse of dimensionality”, the existing literature offers no well-grounded methodology for simultaneous inference and variable selection in GAMs, although both are essential for enhancing the capability of statistical models. To this end, we establish simultaneous confidence corridors (SCCs) and a type of Bayesian information criterion (BIC) through the spline-backfitted kernel smoothing techniques proposed in recent articles. To characterize the global features of each non-parametric component, SCCs are constructed for testing its overall trend and entire shape. By extending the BIC for additive models with identity/trivial link, an asymptotically consistent BIC approach for variable selection is built for GAMs to improve model parsimony without loss of prediction accuracy. Simulations and a real example corroborate the above findings.
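
For orientation, the objects named in the abstract can be written out in generic notation. The display below is only a sketch built from standard textbook definitions, not the paper's own formulas: the symbols $g$, $c$, $m_\alpha$, $d$, $[a,b]$, $\ell_n$, $u_n$, the level $1-\gamma$, the subset $S$, and $\mathrm{df}(S)$ are all assumptions introduced here, and the criterion shown is the usual Schwarz form of BIC, which may differ from the specific criterion the authors derive.

```latex
% Generic GAM with known link g (assumed notation):
g\bigl(\mathbb{E}[Y \mid X_1,\dots,X_d]\bigr) = c + \sum_{\alpha=1}^{d} m_\alpha(X_\alpha)

% A simultaneous confidence corridor for one component m_\alpha on [a,b]
% covers the whole curve at once, not just one point at a time:
\lim_{n\to\infty} \Pr\bigl\{ \ell_n(x) \le m_\alpha(x) \le u_n(x)
    \ \text{for all } x \in [a,b] \bigr\} = 1-\gamma

% Usual Schwarz-type BIC for a candidate variable subset S
% (\hat{L}(S) is the maximized likelihood, df(S) the number of fitted parameters):
\mathrm{BIC}(S) = -2\log \hat{L}(S) + \mathrm{df}(S)\,\log n
```

Under this reading, a hypothesis about the overall shape of a component (for example, that $m_\alpha$ is linear) would be rejected when the hypothesized curve leaves the corridor somewhere on $[a,b]$, and variable selection would amount to choosing the subset $S$ that minimizes the criterion.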

Keywords

BIC · Confidence corridor · Extreme value · Generalized additive model · Spline-backfitted kernel

Mathematics Subject Classification

62G08 62G15 62G32 



Copyright information

© Sociedad de Estadística e Investigación Operativa 2016

Authors and Affiliations

  • Shuzhuan Zheng (1, 2)
  • Rong Liu (3)
  • Lijian Yang (4)
  • Wolfgang K. Härdle (5, 6)

  1. Center for Advanced Statistics and Econometrics Research, Soochow University, Suzhou, China
  2. Department of Economics, Columbia University, New York, USA
  3. Department of Mathematics and Statistics, University of Toledo, Toledo, USA
  4. Center for Statistical Science and Department of Industrial Engineering, Tsinghua University, Beijing, China
  5. C.A.S.E.-Center for Applied Statistics and Economics, Humboldt-Universität zu Berlin, Berlin, Germany
  6. Lee Kong Chian School of Business, Sim Kee Boon Institute for Financial Economics, Singapore Management University, Singapore, Singapore
