Nonparametric estimation of the link function including variable selection

Statistics and Computing

Abstract

Nonparametric methods for estimating the link function in generalized linear models can avoid bias in the regression parameters. However, the link has typically been estimated within the full model, which includes all predictors. When the number of predictors is large, these methods fail because the full model cannot be estimated. This article proposes a boosting-type method that simultaneously selects predictors and estimates the link function. The method performs well in simulations and real-data examples.
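
To illustrate how a boosting-type procedure can select predictors while it fits, the following minimal sketch runs componentwise L2 boosting on synthetic data. It is not the method proposed in the article: the link is fixed to the identity rather than estimated, and the function name `componentwise_l2_boost`, the step size `nu`, the number of steps, and the simulated data are all assumptions made for this example.

```python
import random


def componentwise_l2_boost(X, y, steps=150, nu=0.1):
    """Componentwise L2 boosting with a fixed identity link (illustrative only).

    X is a list of rows, y a list of responses. Returns (intercept, beta);
    predictors that are never selected keep a coefficient of exactly 0.
    """
    n, p = len(X), len(X[0])
    # Center each predictor so the intercept can be handled separately.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    col_ss = [sum(v * v for v in col) for col in cols]
    y_mean = sum(y) / n
    beta = [0.0] * p
    resid = [yi - y_mean for yi in y]
    for _ in range(steps):
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # a constant predictor carries no information
            # Least-squares slope of the current residuals on predictor j.
            b = sum(x * r for x, r in zip(cols[j], resid)) / col_ss[j]
            gain = b * b * col_ss[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_b, best_gain = j, b, gain
        if best_j is None:
            break  # no predictor improves the fit
        # Weak update: move only a fraction nu toward the best single fit.
        beta[best_j] += nu * best_b
        resid = [r - nu * best_b * x for r, x in zip(resid, cols[best_j])]
    intercept = y_mean - sum(b * m for b, m in zip(beta, means))
    return intercept, beta


# Synthetic data (made up for this sketch): only predictors 0 and 3 matter.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
y = [1.0 + 2.0 * row[0] - 1.5 * row[3] for row in X]
intercept, beta = componentwise_l2_boost(X, y)
```

Each step updates only the single predictor that best explains the current residuals, so predictors that are never chosen keep a zero coefficient. In this sketch the number of steps plays the role of the tuning parameter: stopping earlier leaves more coefficients at zero.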

Author information

Corresponding author

Correspondence to Gerhard Tutz.

About this article

Cite this article

Tutz, G., Petry, S. Nonparametric estimation of the link function including variable selection. Stat Comput 22, 545–561 (2012). https://doi.org/10.1007/s11222-011-9246-z

  • DOI: https://doi.org/10.1007/s11222-011-9246-z
