TEST, Volume 25, Issue 4, pp 607–626

Statistical inference for generalized additive models: simultaneous confidence corridors and variable selection

  • Shuzhuan Zheng
  • Rong Liu
  • Lijian Yang
  • Wolfgang K. Härdle
Original Paper

Abstract

Despite the widespread use of generalized additive models (GAMs) to remedy the “curse of dimensionality”, the existing literature offers no well-grounded methodology for simultaneous inference and variable selection in GAMs, although both are essential for enhancing the capability of statistical models. To this end, we establish simultaneous confidence corridors (SCCs) and a type of Bayesian information criterion (BIC) through the spline-backfitted kernel smoothing techniques proposed in recent articles. To characterize the global features of each nonparametric component, SCCs are constructed for testing its overall trend and entire shape. By extending the BIC for additive models with identity (trivial) link, an asymptotically consistent BIC approach for variable selection is developed for GAMs, improving model parsimony without loss of prediction accuracy. Simulations and a real example corroborate these findings.

Keywords

BIC, Confidence corridor, Extreme value, Generalized additive model, Spline-backfitted kernel

Mathematics Subject Classification

62G08 62G15 62G32 

Notes

Acknowledgments

This work is supported in part by the Jiangsu Specially Appointed Professor Program SR10700111, the Jiangsu Key Discipline Program (Statistics) ZY107002, ZY107992, National Natural Science Foundation of China award 11371272, Research Fund for the Doctoral Program of Higher Education of China award 20133201110002, United States NSF awards DMS 0706518, DMS 1007594, a Michigan State University Dissertation Continuation Fellowship, funding from the National University of Singapore, the Deutsche Forschungsgemeinschaft (DFG) via SFB 649 “Economic Risk”, and the International Research Training Group (IRTG) 1792. The helpful comments from two Reviewers and an Associate Editor are gratefully acknowledged.

Supplementary material

11749_2016_480_MOESM1_ESM.pdf (105 kb)
Supplementary material 1 (pdf 104 KB)

Copyright information

© Sociedad de Estadística e Investigación Operativa 2016

Authors and Affiliations

  • Shuzhuan Zheng (1, 2)
  • Rong Liu (3)
  • Lijian Yang (4)
  • Wolfgang K. Härdle (5, 6)

  1. Center for Advanced Statistics and Econometrics Research, Soochow University, Suzhou, China
  2. Department of Economics, Columbia University, New York, USA
  3. Department of Mathematics and Statistics, University of Toledo, Toledo, USA
  4. Center for Statistical Science and Department of Industrial Engineering, Tsinghua University, Beijing, China
  5. C.A.S.E.-Center for Applied Statistics and Economics, Humboldt-Universität zu Berlin, Berlin, Germany
  6. Lee Kong Chian School of Business, Sim Kee Boon Institute for Financial Economics, Singapore Management University, Singapore, Singapore
