Abstract
The Bayesian approach to modelling differs from the frequentist approach primarily in that it supplements the data with prior information about the parameters. If we specify a “good” prior, in the sense that it nudges the likelihood in the right direction, then the resulting estimates will also be good. This is what we aim to do for variable selection problems, where the Bayesian method reduces the selection problem to one of estimation, rather than a true search of the variable space for the model that optimises a certain criterion. We contribute to the vast literature on variable selection by using I-priors [5], a class of Gaussian distributions with the distinguishing property that their covariance is proportional to the Fisher information (of the model parameters). The original motivation behind the I-prior methodology was to develop a novel unifying approach to various regression models. In this work, we detail the I-prior model used, and present simulation results and several real-world applications in which the I-prior performs favourably compared to other prior distributions and/or variable selection techniques in terms of model size, \(R^2\), predictive ability, and so on.
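To make the defining property concrete, here is a minimal one-predictor sketch (our own assumptions, not the chapter's implementation: unit error variance, a fixed scale parameter `lam` rather than one estimated from the data, and a plain linear model):

```python
import random

rng = random.Random(0)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * xi + rng.gauss(0, 1) for xi in x]  # true slope 2, error variance 1

# The Fisher information for the slope beta in y_i = beta * x_i + eps_i
# (with unit error variance) is s = sum(x_i^2), so the I-prior is
# beta ~ N(0, lam * s) for a scale parameter lam (here fixed, not estimated).
lam = 1.0
s = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))

beta_ols = sxy / s                      # maximum likelihood estimate
beta_ipr = sxy / (s + 1.0 / (lam * s))  # posterior mean; precision = s + (lam*s)^{-1}
print(beta_ols, beta_ipr)               # I-prior estimate shrinks slightly towards 0
```

Because the prior variance scales with the information \(s\), the shrinkage towards zero is mild when the data are informative, which is the intuition behind letting the covariance track the Fisher information.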
Notes
1. Briefly, in testing a point null hypothesis of the mean of a normally distributed parameter, the null hypothesis is increasingly accepted as the prior variance of the parameter approaches infinity, regardless of the evidence for or against the null. The paradox is also termed the Jeffreys–Lindley paradox [40].
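The paradox can be seen numerically. The sketch below uses our own assumptions (a single normal mean with known variance, prior \(\mu \sim N(0, \tau^2)\) under the alternative); it is an illustration, not taken from the chapter:

```python
import math

def bayes_factor_01(xbar, n, sigma2, tau2):
    """BF_01 for H0: mu = 0 vs H1: mu ~ N(0, tau2), given xbar ~ N(mu, sigma2/n)."""
    def norm_pdf(x, var):
        return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)
    s2 = sigma2 / n
    # Marginal of xbar: N(0, s2) under H0, N(0, s2 + tau2) under H1
    return norm_pdf(xbar, s2) / norm_pdf(xbar, s2 + tau2)

# xbar = 0.2 with n = 100 and sigma2 = 1 gives z = 2 (nominally significant at 5%),
# yet the Bayes factor in favour of H0 grows without bound as tau2 increases:
for tau2 in [1.0, 100.0, 10_000.0]:
    print(tau2, bayes_factor_01(0.2, 100, 1.0, tau2))
```

Even though the data are "significant" in the frequentist sense, the Bayes factor favours the null more and more strongly as the prior on \(\mu\) becomes more diffuse.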
2. The Jeffreys prior for a parameter \(\theta\) is defined as \(p(\theta) \propto \vert {\mathcal I}(\theta) \vert^{1/2}\) [21].
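As a standard worked example (not taken from the chapter): for a Bernoulli likelihood with success probability \(\theta\), the Fisher information is \({\mathcal I}(\theta) = 1/(\theta(1-\theta))\), so the Jeffreys prior is

\[
p(\theta) \propto {\mathcal I}(\theta)^{1/2} = \theta^{-1/2}(1-\theta)^{-1/2},
\]

i.e. a \(\text{Beta}(1/2, 1/2)\) distribution.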
3. For any row of \({\mathbf{X}}\), \(\text{Cov}[X_j, X_k] = \text{Cov}[Z_j + U, Z_k + U] = \text{Var}[U] = 1\), and \(\text{Var}[X_j] = \text{Var}[Z_j + U] = 2\). Thus, \(\text{Corr}[X_j, X_k] = \text{Cov}[X_j, X_k] / (\text{Var}[X_j]\,\text{Var}[X_k])^{1/2} = 1/2\).
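The value \(1/2\) can be checked by simulation; the sketch below (function name and sample size are our own choices) draws \(Z_j, Z_k, U \sim N(0,1)\) independently and estimates the correlation of \(X_j = Z_j + U\) and \(X_k = Z_k + U\) empirically:

```python
import random

def simulate_corr(n=100_000, seed=1):
    """Empirical Corr[X_j, X_k] where X_j = Z_j + U with Z_j, Z_k, U iid N(0, 1)."""
    rng = random.Random(seed)
    xj, xk = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        xj.append(rng.gauss(0, 1) + u)
        xk.append(rng.gauss(0, 1) + u)
    mj, mk = sum(xj) / n, sum(xk) / n
    cov = sum((a - mj) * (b - mk) for a, b in zip(xj, xk)) / n
    vj = sum((a - mj) ** 2 for a in xj) / n
    vk = sum((b - mk) ** 2 for b in xk) / n
    return cov / (vj * vk) ** 0.5

print(simulate_corr())  # close to the theoretical value 1/2
```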
4. Since the total model space used differed between our method and those of C&M and B&F, it does not make sense to compare the posterior model probabilities that we obtained. C&M reported a model probability of 0.491 for their model, but this model was not selected at all using the I-prior.
5. Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under CC BY-SA 3.0. Created using the ggmap package [22] in R.
References
Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory, pp. 267–281. Akadémiai Kiadó (1973)
Banner, K.M., Higgs, M.D.: Considerations for assessing model averaging of regression coefficients. Ecol. Appl. 27(1), 78–93 (2017). https://doi.org/10.1002/eap.1419
Barbieri, M.M., Berger, J.O.: Optimal predictive model selection. Ann. Stat. 32(3), 870–897 (2004). https://doi.org/10.1214/009053604000000238
Bergsma, W.: Regression with I-priors. J. Econom. Stat. (2019). https://doi.org/10.1016/j.ecosta.2019.10.002
Bergsma, W.: Regression with I-priors. Econom. Stat. 14, 89–111 (2020)
Berlinet, A., Thomas-Agnan, C.: Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer, Boston, MA (2004). ISBN 978-1-4613-4792-7. https://doi.org/10.1007/978-1-4419-9096-9
Breiman, L., Friedman, J.H.: Estimating optimal transformations for multiple regression and correlation. J. Am. Stat. Assoc. 80(391), 590–598 (1985). https://doi.org/10.1080/01621459.1985.10478157
Cade, B.S.: Model averaging and muddled multimodel inferences. Ecology 96(9), 2370–2382 (2015). https://doi.org/10.1890/14-1639.1
Casella, G., Girón, F.J., Martínez, M.L., Moreno, E.: Consistency of Bayesian procedures for variable selection. Ann. Stat. 37(3), 1207–1228 (2009). https://doi.org/10.1214/08-AOS606
Casella, G., Moreno, E.: Objective Bayesian variable selection. J. Am. Stat. Assoc. 101(473), 157–167 (2006). https://doi.org/10.1198/016214505000000646
Chipman, H., George, E.I., McCulloch, R.E.: The practical implementation of Bayesian model selection. In: Lahiri P. (ed.) Model Selection, vol. 38, pp. 65–134. Institute of Mathematical Statistics (2001). https://doi.org/10.1214/lnms/1215540964
Dellaportas, P., Forster, J.J., Ntzoufras, I.: On Bayesian model and variable selection using MCMC. Stat. Comput. 12(1), 27–36 (2002). https://doi.org/10.1023/A:1013164120801
Fouskakis, D., Draper, D.: Comparing stochastic optimization methods for variable selection in binary outcome prediction, with application to health policy. J. Am. Stat. Assoc. 103(484), 1367–1381 (2008). https://doi.org/10.1198/016214508000001048
Friedman, J.H., Hastie, T., Tibshirani, R.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, New York (2009). ISBN 978-0-387-84857-0. https://doi.org/10.1007/978-0-387-84858-7
George, E.I., McCulloch, R.E.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88(423), 881–889 (1993). https://doi.org/10.2307/2290777
Geweke, J.: Variable selection and model comparison in regression. In: Bernardo, J.M., Berger, J.O., Philip Dawid, A., Smith, A.F.M. (eds.) Bayesian Statistics 5. Proceedings of the Fifth Valencia International Meeting. Oxford University Press (1996). ISBN: 978-0-19-852356-7
Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970). https://doi.org/10.2307/1267351
Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T.: Bayesian model averaging: a tutorial. Stat. Sci. 14(4), 382–401 (1999). https://doi.org/10.1214/ss/1009212519
Jamil, H.: ipriorBVS: Bayesian Variable Selection Using I-priors. R package version 0.1.1 (2018). https://github.com/haziqj/ipriorBVS
Jamil, H.: Regression modelling using priors depending on Fisher information covariance kernels (I-priors). Ph.D. thesis, London School of Economics and Political Science (2018)
Jeffreys, H.: An invariant form for the prior probability in estimation problems. Proc. Roy. Soc. A 186(1007), 453–461 (1946). https://doi.org/10.1098/rspa.1946.0056
Kahle, D., Wickham, H.: ggmap: spatial visualization with ggplot2. R J. 5(1), 144–161 (2013)
Kass, R.E., Raftery, A.E.: Bayes factors. J. Am. Stat. Assoc. 90(430), 773–795 (1995). https://doi.org/10.2307/2291091
Kuo, L., Mallick, B.: Variable selection for regression models. Sankhyā: Indian J. Stat. Ser. B 60(1), 65–81 (1998)
Kyung, M., Gill, J., Ghosh, M., Casella, G.: Penalized regression, standard errors, and Bayesian lassos. Bayesian Anal. 5(2), 369–411 (2010). https://doi.org/10.1214/10-BA607
Lee, K.E., Sha, N., Dougherty, E.R., Vannucci, M., Mallick, B.: Gene selection: a Bayesian variable selection approach. Bioinformatics 19(1), 90–97 (2003). https://doi.org/10.1093/bioinformatics/19.1.90
Leisch, F., Dimitriadou, E.: mlbench: Machine Learning Benchmark Problems. R package version 2.1-1 (2010)
Lindley, D.V.: A statistical paradox. Biometrika 44(1–2), 187–192 (1957). https://doi.org/10.1093/biomet/44.1-2.187
Madigan, D., Raftery, A.E.: Model selection and accounting for model uncertainty in graphical models using Occam’s window. J. Am. Stat. Assoc. 89(428), 1535–1546 (1994). https://doi.org/10.2307/2291017
Mallows, C.L.: Some comments on \(C_p\). Technometrics 15(4), 661–675 (1973). https://doi.org/10.2307/1267380
McDonald, G.C., Schwing, R.C.: Instabilities of regression estimates relating air pollution to mortality. Technometrics 15(3), 463–481 (1973). https://doi.org/10.2307/1266852
Miller, A.: Subset selection in regression. Chapman & Hall/CRC (2002). ISBN: 978-1-58488-171-1
Mitchell, T.J., Beauchamp, J.J.: Bayesian variable selection in linear regression. J. Am. Stat. Assoc. 83(404), 1023–1032 (1988). https://doi.org/10.2307/2290129
Ntzoufras, I.: Bayesian modeling using WinBUGS. Wiley (2011). ISBN 978-0-470-14114-4. https://doi.org/10.1002/9780470434567
O’Hara, R.B., Sillanpää, M.J.: A review of Bayesian variable selection methods: what, how and which. Bayesian Anal. 4(1), 85–117 (2009). https://doi.org/10.1214/09-BA403
Ormerod, J.T., You, C., Müller, S.: A variational Bayes approach to variable selection. Electron. J. Stat. 11(2), 3549–3594 (2017). https://doi.org/10.1214/17-EJS1332
Park, T., Casella, G.: The Bayesian Lasso. J. Am. Stat. Assoc. 103(482), 681–686 (2008). https://doi.org/10.1198/016214508000000337
Plummer, M.: JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In: Hornik, K., Leisch, F., Zeileis, A. (eds.) Proceedings of the Third International Workshop on Distributed Statistical Computing (DSC 2003), Vienna, Austria (2003)
Raftery, A.E., Madigan, D., Hoeting, J.A.: Bayesian model averaging for linear regression models. J. Am. Stat. Assoc. 92(437), 179–191 (1997). https://doi.org/10.1080/01621459.1997.10473615
Robert, C.: On the Jeffreys-Lindley paradox. Philos. Sci. 81(2), 216–232 (2014). arXiv: 1303.5973
SAS Institute Inc.: SAS/STAT(R) 9.2 User’s Guide, 2nd edn. SAS Institute Inc., Cary, NC (2008). ISBN: 978-1-60764-566-5
Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978). https://doi.org/10.1214/aos/1176344136
Scott, S.L., Varian, H.R.: Predicting the present with Bayesian structural time series. Int. J. Math. Model. Numer. Optim. 5(1–2), 4–23 (2014). https://doi.org/10.1504/IJMMNO.2014.059942
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc.: Ser. B (Methodol.) 58(1), 267–288 (1996). https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 67(2), 301–320 (2005). https://doi.org/10.1111/j.1467-9868.2005.00503.x
Zellner, A.: On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In: Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233–243. Elsevier, New York (1986)
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this chapter
Jamil, H., Bergsma, W. (2021). Bayesian Variable Selection for Linear Models Using I-Priors. In: Abdul Karim, S.A. (eds) Theoretical, Modelling and Numerical Simulations Toward Industry 4.0. Studies in Systems, Decision and Control, vol 319. Springer, Singapore. https://doi.org/10.1007/978-981-15-8987-4_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8986-7
Online ISBN: 978-981-15-8987-4
eBook Packages: Intelligent Technologies and Robotics (R0)