Bayesian generalized additive model selection including a fast variational option

  • Original Paper
  • Published in: AStA Advances in Statistical Analysis

Abstract

We use Bayesian model selection paradigms, such as group least absolute shrinkage and selection operator priors, to facilitate generalized additive model selection. Our approach allows for the effects of continuous predictors to be categorized as either zero, linear or non-linear. Employment of carefully tailored auxiliary variables results in Gibbsian Markov chain Monte Carlo schemes for practical implementation of the approach. In addition, mean field variational algorithms with closed form updates are obtained. Whilst not as accurate, this fast variational option enhances scalability to very large data sets. A package in the R language aids use in practice.
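
As a rough illustration of the zero/linear/non-linear categorization described above, the following Python sketch labels each predictor from posterior draws of two binary inclusion indicators. It is a minimal sketch under stated assumptions, not the authors' algorithm and not the interface of their R package: the function name `classify_effects`, the two indicator arrays and the 0.5 threshold are all hypothetical choices made for this example.

```python
import numpy as np


def classify_effects(gamma_linear, gamma_spline, threshold=0.5):
    """Label each predictor as "zero", "linear" or "non-linear".

    gamma_linear, gamma_spline: arrays of shape (n_draws, n_predictors)
    holding 0/1 posterior draws of hypothetical inclusion indicators for
    the linear coefficient and the spline coefficients of each predictor.
    threshold: assumed cut-off on the posterior inclusion probability.
    """
    # Posterior inclusion probabilities are estimated by averaging the draws.
    p_linear = np.asarray(gamma_linear, dtype=float).mean(axis=0)
    p_spline = np.asarray(gamma_spline, dtype=float).mean(axis=0)

    labels = []
    for pl, ps in zip(p_linear, p_spline):
        if ps > threshold:
            # Spline component is likely present: effect treated as non-linear.
            labels.append("non-linear")
        elif pl > threshold:
            # Only the linear component is likely present.
            labels.append("linear")
        else:
            labels.append("zero")
    return labels


# Toy draws (4 draws, 3 predictors), made up purely for illustration.
gamma_linear = np.array([[0, 1, 1], [0, 1, 1], [0, 1, 0], [1, 1, 1]])
gamma_spline = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0]])
print(classify_effects(gamma_linear, gamma_spline))
```

The draws could in principle come from either a Markov chain Monte Carlo run or, with probabilities supplied directly instead of averaged draws, from a variational approximation; the sketch makes no claim about how the paper itself sets the decision rule.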


Acknowledgements

We are grateful to two reviewers for their comments that led to improvements. This research was supported by Australian Research Council grant DP180100597.

Funding

Australian Research Council (DP180100597).

Author information

Corresponding author

Correspondence to Virginia X. He.

Ethics declarations

Conflict of interest

The authors report that there are no competing interests to declare.

Supplementary Information

Electronic supplementary material accompanies the article: Supplementary file 1 (PDF, 355 KB).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, V.X., Wand, M.P. Bayesian generalized additive model selection including a fast variational option. AStA Adv Stat Anal (2023). https://doi.org/10.1007/s10182-023-00490-y

  • DOI: https://doi.org/10.1007/s10182-023-00490-y
