
Bayesian Variable Selection for Linear Models Using I-Priors

  • Chapter

Theoretical, Modelling and Numerical Simulations Toward Industry 4.0

Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 319)


Abstract

The Bayesian approach to modelling differs from the frequentist approach primarily in that it supplements the data with additional information about the parameters in the form of a prior distribution. If we specify a “good” prior, in the sense that the prior nudges the likelihood in the right direction, then the estimates will also be good. This is what we aim for in variable selection problems, whereby the Bayesian method reduces the selection problem to one of estimation, rather than a true search of the variable space for the model which optimises a certain criterion. We contribute to the vast literature on variable selection methods by using I-priors [5], a class of Gaussian distributions with the distinguishing property that the covariance is proportional to the Fisher information (of the model parameters). The original motivation behind the I-prior methodology was to develop a novel unifying approach to various regression models. In this work, we detail the I-prior model used, and showcase simulation results and several real-world applications in which the I-prior compares favourably with other prior distributions and/or variable selection techniques in terms of model size, \(R^2\), predictive ability, and so on.
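To make the covariance property concrete, the following is a minimal sketch of the I-prior in the standard normal linear model, following the general construction in [5]; the prior mean \(\beta_0\) and the scale parameter \(\kappa\) are generic placeholders rather than the chapter's exact notation:

\[ y = {\mathbf{X}}\beta + \epsilon, \qquad \epsilon \sim \text{N}_n(0, \psi^{-1}{\mathbf{I}}_n), \qquad {\mathcal I}(\beta) = \psi {\mathbf{X}}^\top {\mathbf{X}}, \qquad \beta \sim \text{N}_p\big(\beta_0, \kappa \psi {\mathbf{X}}^\top {\mathbf{X}}\big). \]

Because \(\text{Var}(\beta) \propto {\mathcal I}(\beta)\), directions of the parameter space about which the data carry much Fisher information receive a diffuse prior, while directions about which the data are uninformative are shrunk towards the prior mean.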


Notes

  1. Briefly, in testing a point null hypothesis about the mean of a normally distributed parameter, the null hypothesis is increasingly favoured as the prior variance of the parameter approaches infinity, regardless of the evidence for or against it. The paradox is also termed the Jeffreys–Lindley paradox [40].

  2. The Jeffreys prior for a parameter \(\theta\) is defined as \(p(\theta) \propto \vert {\mathcal I}(\theta) \vert^{1/2}\), where \({\mathcal I}(\theta)\) is the Fisher information [21]. A worked Bernoulli example is given after these notes.

  3. 3.

    For any row of \({\mathbf{X}}\), \(\text {Cov}[X_j, X_k] = \text {Cov}[Z_j + U, Z_k + U] = \text {Var }[ U] = 1\), and \(\text {Var}[X_j] = \text {Var}[Z_j + U] = 2\). Thus, \(\text {Corr}[X_j, X_k] = \text {Cov}[X_j, X_k] / (\text {Var}[X_j]\text {Var}[X_k])^{1/2} = 1/2\).

  4. Since the total model space differed between our method and those of C&M and B&F, it is not meaningful to compare the posterior model probabilities obtained. C&M reported a posterior probability of 0.491 for their model, but this model was not selected at all using the I-prior.

  5. Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under CC BY-SA 3.0. Created using the ggmap package [22] in R.
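As a worked instance of the Jeffreys prior defined in note 2 (a standard example, stated here for concreteness rather than taken from the chapter): for a single Bernoulli(\(\theta\)) observation, the Fisher information is \({\mathcal I}(\theta) = 1/\{\theta(1-\theta)\}\), so that

\[ p(\theta) \propto \vert {\mathcal I}(\theta) \vert^{1/2} = \theta^{-1/2}(1-\theta)^{-1/2}, \]

which is the kernel of a Beta(1/2, 1/2) distribution.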
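The correlation claimed in note 3 can also be verified by simulation. Below is a minimal R sketch (R being the language of the packages cited in this chapter); the sample size n and number of predictors p are illustrative choices, not the chapter's simulation settings:

    # Check note 3: X_j = Z_j + U with Z_j, U iid N(0, 1)
    # implies Corr[X_j, X_k] = 1/2 for all j != k.
    set.seed(1)
    n <- 1e5                             # number of simulated rows (illustrative)
    p <- 5                               # number of predictors (illustrative)
    Z <- matrix(rnorm(n * p), nrow = n)  # independent components Z_1, ..., Z_p
    U <- rnorm(n)                        # shared component, common to every column
    X <- Z + U                           # U is recycled down each column of Z
    round(cor(X), 2)                     # off-diagonal entries are close to 0.50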

References

  1. Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory, pp. 267–281. Akadémiai Kiadó (1973)


  2. Banner, K.M., Higgs, M.D.: Considerations for assessing model averaging of regression coefficients. Ecol. Appl. 27(1), 78–93 (2017). https://doi.org/10.1002/eap.1419

  3. Barbieri, M.M., Berger, J.O.: Optimal predictive model selection. Ann. Stat. 32(3), 870–897 (2004). https://doi.org/10.1214/009053604000000238

  4. Bergsma, W.: Regression with I-priors. Econom. Stat. (2019). https://doi.org/10.1016/j.ecosta.2019.10.002

  5. Bergsma, W.: Regression with I-priors. Econom. Stat. 14, 89–111 (2020)


  6. Berlinet, A., Thomas-Agnan, C.: Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer, Boston, MA (2004). ISBN 978-1-4613-4792-7. https://doi.org/10.1007/978-1-4419-9096-9

  7. Breiman, L., Friedman, J.H.: Estimating optimal transformations for multiple regression and correlation. J. Am. Stat. Assoc. 80(391), 590–598 (1985). https://doi.org/10.1080/01621459.1985.10478157

  8. Cade, B.S.: Model averaging and muddled multimodel inferences. Ecology 96(9), 2370–2382 (2015). https://doi.org/10.1890/14-1639.1

  9. Casella, G., Girón, F.J., Martínez, M.L., Moreno, E.: Consistency of Bayesian procedures for variable selection. Ann. Stat. 37(3), 1207–1228 (2009). https://doi.org/10.1214/08-AOS606

  10. Casella, G., Moreno, E.: Objective Bayesian variable selection. J. Am. Stat. Assoc. 101(473), 157–167 (2006). https://doi.org/10.1198/016214505000000646

  11. Chipman, H., George, E.I., McCulloch, R.E.: The practical implementation of Bayesian model selection. In: Lahiri P. (ed.) Model Selection, vol. 38, pp. 65–134. Institute of Mathematical Statistics (2001). https://doi.org/10.1214/lnms/1215540964

  12. Dellaportas, P., Forster, J.J., Ntzoufras, I.: On Bayesian model and variable selection using MCMC. Stat. Comput. 12(1), 27–36 (2002). https://doi.org/10.1023/A:1013164120801

  13. Fouskakis, D., Draper, D.: Comparing stochastic optimization methods for variable selection in binary outcome prediction, with application to health policy. J. Am. Stat. Assoc. 103(484), 1367–1381 (2008). https://doi.org/10.1198/016214508000001048

  14. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, New York (2009). ISBN 978-0-387-84857-0. https://doi.org/10.1007/978-0-387-84858-7

  15. George, E.I., McCulloch, R.E.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88(423), 881–889 (1993). https://doi.org/10.2307/2290777

  16. Geweke, J.: Variable selection and model comparison in regression. In: Bernardo, J.M., Berger, J.O., Philip Dawid, A., Smith, A.F.M. (eds.) Bayesian Statistics 5. Proceedings of the Fifth Valencia International Meeting. Oxford University Press (1996). ISBN: 978-0-19-852356-7


  17. Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970). https://doi.org/10.2307/1267351

  18. Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T.: Bayesian model averaging: a tutorial. Stat. Sci. 14(4), 382–401 (1999). https://doi.org/10.1214/ss/1009212519

  19. Jamil, H.: ipriorBVS: Bayesian Variable Selection Using I-priors. R package version 0.1.1 (2018). https://github.com/haziqj/ipriorBVS

  20. Jamil, H.: Regression modelling using priors depending on Fisher information covariance kernels (I-priors). Ph.D. thesis, London School of Economics and Political Science (2018)


  21. Jeffreys, H.: An invariant form for the prior probability in estimation problems. Proc. Roy. Soc. A 186(1007), 453–461 (1946). https://doi.org/10.1098/rspa.1946.0056

  22. Kahle, D., Wickham, H.: ggmap: spatial visualization with ggplot2. R J. 5(1), 144–161 (2013)


  23. Kass, R.E., Raftery, A.E.: Bayes factors. J. Am. Stat. Assoc. 90(430), 773–795 (1995). https://doi.org/10.2307/2291091

  24. Kuo, L., Mallick, B.: Variable selection for regression models. Sankhyā: Indian J. Stat. Ser. B 60(1), 65–81 (1998)


  25. Kyung, M., Gill, J., Ghosh, M., Casella, G.: Penalized regression, standard errors, and Bayesian lassos. Bayesian Anal. 5(2), 369–411 (2010). https://doi.org/10.1214/10-BA607

  26. Lee, K.E., Sha, N., Dougherty, E.R., Vannucci, M., Mallick, B.: Gene selection: a Bayesian variable selection approach. Bioinformatics 19(1), 90–97 (2003). https://doi.org/10.1093/bioinformatics/19.1.90

  27. Leisch, F., Dimitriadou, E.: mlbench: Machine Learning Benchmark Problems. R package version 2.1-1 (2010)


  28. Lindley, D.V.: A statistical paradox. Biometrika 44(1–2), 187–192 (1957). https://doi.org/10.1093/biomet/44.1-2.187

  29. Madigan, D., Raftery, A.E.: Model selection and accounting for model uncertainty in graphical models using Occam’s window. J. Am. Stat. Assoc. 89(428), 1535–1546 (1994). https://doi.org/10.2307/2291017

  30. Mallows, C.L.: Some comments on \(C_P\). Technometrics 15(4), 661–675 (1973). https://doi.org/10.2307/1267380

  31. McDonald, G.C., Schwing, R.C.: Instabilities of regression estimates relating air pollution to mortality. Technometrics 15(3), 463–481 (1973). https://doi.org/10.2307/1266852

  32. Miller, A.: Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC (2002). ISBN: 978-1-58488-171-1


  33. Mitchell, T.J., Beauchamp, J.J.: Bayesian variable selection in linear regression. J. Am. Stat. Assoc. 83(404), 1023–1032 (1988). https://doi.org/10.2307/2290129

  34. Ntzoufras, I.: Bayesian modeling using WinBUGS. Wiley (2011). ISBN 978-0-470-14114-4. https://doi.org/10.1002/9780470434567

  35. O’Hara, R.B., Sillanpää, M.J.: A review of Bayesian variable selection methods: what, how and which. Bayesian Anal. 4(1), 85–117 (2009). https://doi.org/10.1214/09-BA403

  36. Ormerod, J.T., You, C., Müller, S.: A variational Bayes approach to variable selection. Electron. J. Stat. 11(2), 3549–3594 (2017). https://doi.org/10.1214/17-EJS1332

  37. Park, T., Casella, G.: The Bayesian Lasso. J. Am. Stat. Assoc. 103(482), 681–686 (2008). https://doi.org/10.1198/016214508000000337

  38. Plummer, M.: JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In: Hornik, K., Leisch, F., Zeileis, A. (eds.) Proceedings of the Third International Workshop on Distributed Statistical Computing (DSC 2003), Vienna, Austria (2003)


  39. Raftery, A.E., Madigan, D., Hoeting, J.A.: Bayesian model averaging for linear regression models. J. Am. Stat. Assoc. 92(437), 179–191 (1997). https://doi.org/10.1080/01621459.1997.10473615

  40. Robert, C.: On the Jeffreys-Lindley paradox. Philos. Sci. 81(2), 216–232 (2014). arXiv: 1303.5973


  41. SAS Institute Inc.: SAS/STAT(R) 9.2 User’s Guide, 2nd edn. SAS Institute Inc., Cary, NC (2008). ISBN: 978-1-60764-566-5


  42. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978). https://doi.org/10.1214/aos/1176344136

  43. Scott, S.L., Varian, H.R.: Predicting the present with Bayesian structural time series. Int. J. Math. Model. Numer. Optim. 5(1–2), 4–23 (2014). https://doi.org/10.1504/IJMMNO.2014.059942

  44. Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc.: Ser. B (Methodol.) 58(1), 267–288 (1996). https://doi.org/10.1111/j.2517-6161.1996.tb02080.x

  45. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 67(2), 301–320 (2005). https://doi.org/10.1111/j.1467-9868.2005.00503.x

  46. Zellner, A.: On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In: Goel, P.K., Zellner, A. (eds.) Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233–243. Elsevier, New York (1986)



Author information


Correspondence to Haziq Jamil.



Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Jamil, H., Bergsma, W. (2021). Bayesian Variable Selection for Linear Models Using I-Priors. In: Abdul Karim, S.A. (ed.) Theoretical, Modelling and Numerical Simulations Toward Industry 4.0. Studies in Systems, Decision and Control, vol. 319. Springer, Singapore. https://doi.org/10.1007/978-981-15-8987-4_8

