A unified framework of constrained regression

Statistics and Computing

Abstract

Generalized additive models (GAMs) play an important role in modeling and understanding complex relationships in modern applied statistics. They allow for flexible, data-driven estimation of covariate effects. Yet researchers often have a priori knowledge of certain effects, which might be monotonic or periodic (cyclic) or should fulfill boundary conditions. We propose a unified framework to incorporate these constraints for both univariate and bivariate effect estimates and for varying coefficients. As the framework is based on component-wise boosting methods, variables can be selected intrinsically, and effects can be estimated for a wide range of different distributional assumptions. Bootstrap confidence intervals for the effect estimates are derived to assess the models. We present three case studies from environmental sciences to illustrate the proposed seamless modeling framework. All discussed constrained effect estimates are implemented in the comprehensive R package mboost for model-based boosting.
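
The intrinsic variable selection mentioned in the abstract follows from how component-wise boosting works: in each iteration every base-learner is fitted to the current residuals, but only the best-fitting one is updated. The following is a minimal sketch in plain Python, assuming squared-error loss and one simple linear base-learner per covariate. It is not the mboost implementation and includes none of the constrained base-learners described in the paper. The function name `componentwise_l2_boost`, the step length `nu`, and the toy data are illustrative assumptions.

```python
def componentwise_l2_boost(X, y, n_iter=200, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per covariate.

    X is a list of rows, y a list of responses. Returns the intercept,
    the coefficient vector, and the index selected in each iteration.
    """
    n, p = len(X), len(X[0])
    # Center each covariate so every base-learner is a no-intercept fit.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]

    offset = sum(y) / n                  # start from the mean response
    resid = [yi - offset for yi in y]    # negative gradient of the L2 loss
    coef = [0.0] * p
    path = []

    for _ in range(n_iter):
        best_j, best_beta, best_rss = None, 0.0, float("inf")
        for j in range(p):
            sxx = sum(r[j] ** 2 for r in Xc)
            if sxx == 0.0:
                continue                 # constant covariate: nothing to fit
            sxy = sum(r[j] * u for r, u in zip(Xc, resid))
            beta = sxy / sxx             # least-squares fit to the residuals
            rss = sum((u - beta * r[j]) ** 2 for r, u in zip(Xc, resid))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        if best_j is None:
            break
        # Update only the best-fitting base-learner, shrunken by nu.
        coef[best_j] += nu * best_beta
        resid = [u - nu * best_beta * r[best_j] for r, u in zip(Xc, resid)]
        path.append(best_j)

    intercept = offset - sum(c * m for c, m in zip(coef, means))
    return intercept, coef, path


# Toy data (assumed for illustration): four orthogonal +/-1 covariates,
# of which only covariates 0 and 2 influence the response.
X = [
    [1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1],
    [-1, 1, 1, 1], [-1, 1, -1, -1], [-1, -1, 1, -1], [-1, -1, -1, 1],
]
y = [3.0 + 2.0 * row[0] - 1.5 * row[2] for row in X]

intercept, coef, path = componentwise_l2_boost(X, y)
print("selected covariates:", sorted(set(path)))
print("intercept:", round(intercept, 3), "coefficients:", [round(c, 3) for c in coef])
```

In this toy design, covariates 1 and 3 are never selected, so their coefficients stay at exactly zero. With real data the number of boosting iterations is the main tuning parameter and would typically be chosen by resampling, for example cross-validation or the bootstrap.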

Acknowledgments

We thank the “Laboratório de Poluição Atmosférica Experimental, Faculdade de Medicina, Universidade de São Paulo, Brasil”, and Julio M. Singer for letting us use the data on air pollution in São Paulo. We thank Marco Heurich from the Bavarian Forest National Park, Grafenau, Germany, for the roe deer activity data, and Karen A. Brune for linguistic revision of the manuscript. We also thank the Associate Editor and two anonymous reviewers for their stimulating and helpful comments.

Author information

Corresponding author

Correspondence to Benjamin Hofner.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 302 KB)

Supplementary material 2 (pdf 47 KB)

About this article

Cite this article

Hofner, B., Kneib, T. & Hothorn, T. A unified framework of constrained regression. Stat Comput 26, 1–14 (2016). https://doi.org/10.1007/s11222-014-9520-y

  • DOI: https://doi.org/10.1007/s11222-014-9520-y