Bayesian variable selection for correlated covariates via colored cliques

  • Original Paper
  • AStA Advances in Statistical Analysis

Abstract

We propose a Bayesian method to select groups of correlated explanatory variables in a linear regression framework. To do this, we introduce a random matrix \(G\), which encodes the group structure, into the prior distribution assigned to the regression coefficients. The groups can then be inferred by sampling from the posterior distribution of \(G\). We give a graph-theoretic interpretation of \(G\) as the adjacency matrix of a collection of cliques. We also discuss extending the groups from cliques to more general random graphs, so that the proposed approach can be viewed as a method for finding networks of correlated covariates that are associated with the response.

Notes

  1. We could instead set \(G_{ii}=1\). Since one can easily go from one form to the other by a suitable redefinition of the prior distributions, we choose the definitions which we find easier to implement.

  2. To show this, one has to consider a function \(\phi (L, x)\) in (8) that is either finite or has a finite limit as \(x \rightarrow 1\) (if necessary, by simultaneously taking \(L\) very small).

  3. Henceforth, by network we mean a connected component of a graph that is not a clique.
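
The distinction drawn in Note 3 between cliques and networks can be illustrated with a minimal sketch. It assumes that \(G\) is a symmetric 0/1 adjacency matrix with zero diagonal (one of the two conventions mentioned in Note 1). The function names and the example matrix are hypothetical and are not taken from the paper; an isolated covariate is treated here as a clique of size one, which is also an assumption.

```python
import numpy as np

def connected_components(G):
    """Return the connected components of the graph with adjacency matrix G."""
    p = G.shape[0]
    seen = np.zeros(p, dtype=bool)
    components = []
    for start in range(p):
        if seen[start]:
            continue
        # Depth-first search from an unvisited covariate.
        stack = [start]
        seen[start] = True
        comp = []
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in np.flatnonzero(G[i]):
                if not seen[j]:
                    seen[j] = True
                    stack.append(int(j))
        components.append(sorted(comp))
    return components

def is_clique(G, nodes):
    """A component on k nodes is a clique if all k*(k-1) off-diagonal entries are 1."""
    sub = G[np.ix_(nodes, nodes)]
    k = len(nodes)
    return int(sub.sum()) - int(np.trace(sub)) == k * (k - 1)

def classify(G):
    """Label each connected component as a 'clique' or a 'network' (Note 3)."""
    return [(c, "clique" if is_clique(G, c) else "network")
            for c in connected_components(G)]

# Hypothetical example with 7 covariates: a triangle, a path, and an isolated node.
G_example = np.zeros((7, 7), dtype=int)
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5)]:
    G_example[a, b] = G_example[b, a] = 1

for nodes, label in classify(G_example):
    print(nodes, label)
```

In this example the triangle on covariates 0, 1, 2 is a clique, whereas the path 3, 4, 5 is connected but lacks the edge between 3 and 5, so it counts as a network in the sense of Note 3.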

References

  • Besag, J., Green, P.J., Higdon, D., Mengersen, K.: Bayesian computation and stochastic systems. Stat. Sci. 10, 3–66 (1995)

  • Bondell, H.D., Reich, B.J.: Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics 64, 115–123 (2008)

  • Bornn, L., Gottardo, R., Doucet, A.: Grouping priors and the Bayesian elastic net. Tech. Rep., Department of Statistics, University of British Columbia, arXiv:1001.4083v1 [stat.ME] (2010)

  • Box, G.E.P., Tiao, G.C.: Bayesian Inference in Statistical Analysis. Addison-Wesley, Reading (1973)

  • Chatterjee, S., Diaconis, P.: Estimating and understanding exponential random graph models. Tech. Rep., Department of Statistics, Stanford University, arXiv:1102.2650 [math.PR] (2011)

  • Chipman, H., George, E.I., McCulloch, R.E.: The practical implementation of Bayesian model selection. In: Lahiri, P. (ed.) Model Selection, pp. 65–116. Institute of Mathematical Statistics, Beachwood (2001)

  • Clyde, M.A., Ghosh, J., Littman, M.L.: Bayesian adaptive sampling for variable selection and model averaging. J. Comput. Graph. Stat. 20, 80–101 (2011)

  • Drummond, M.J., McCarthy, J.J., Sinha, M., Spratt, H.M., Volpi, E., Esser, K.A., Rasmussen, B.B.: Aging and microRNA expression in human skeletal muscle: a microarray and bioinformatics analysis. Physiol. Genomics 43, 595–603 (2011)

  • Frank, I.E., Friedman, J.H.: A statistical view of some chemometrics regression tools (with discussions). Technometrics 35, 109–148 (1993)

  • Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010)

  • Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis. Chapman and Hall, London (2003)

  • George, E.I.: Dilution priors: compensating for model space redundancy. In: Borrowing Strength: Theory Powering Applications—a Festschrift for Lawrence D. Brown, pp. 158–165. Institute of Mathematical Statistics (2010)

  • George, E.I., McCulloch, R.E.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88, 881–889 (1993)

  • George, E.I., McCulloch, R.E.: Approaches for Bayesian variable selection. Stat. Sin. 7, 339–373 (1997)

  • Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12, 55–82 (1970)

  • Kass, R.E., Wasserman, L.: A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. J. Am. Stat. Assoc. 90, 928–934 (1995)

  • Li, Q., Lin, N.: The Bayesian elastic net. Bayesian Anal. 5, 151–170 (2010)

  • Liang, F., Paulo, R., Molina, G., Clyde, M.A., Berger, J.O.: Mixtures of \(g\) priors for Bayesian variable selection. J. Am. Stat. Assoc. 103, 410–423 (2008)

  • Monni, S., Li, H.: Bayesian methods for network-structured genomics data. In: Chen, M., Dey, D., Müller, P., Sun, D., Ye, K. (eds.) Frontiers of Statistical Decision Making and Bayesian Analysis, pp. 303–315. Springer, New York (2010)

  • Smith, M., Kohn, R.: Nonparametric regression using Bayesian variable selection. J. Econom. 75, 317–343 (1996)

  • Tarjan, R.: A note on finding the bridges of a graph. Inf. Process. Lett. 2, 160–161 (1974)

  • Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)

  • Tutz, G., Ulbricht, J.: Penalized regression with correlation-based penalty. Stat. Comput. 19, 239–253 (2009)

  • Zellner, A.: On assessing prior distributions and Bayesian regression analysis with \(g\)-prior distributions. In: Goel, P.K., Zellner, A. (eds.) Bayesian Inference and Decision Techniques, pp. 233–243. North Holland, Amsterdam (1986)

  • Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67, 301–320 (2005)

Acknowledgments

This research was carried out while the author was at the Department of Public Health of the Weill Cornell Medical College. It was supported by CTSC Grant UL1-RR024996. The author would like to thank two reviewers for their comments and suggestions.

Author information

Corresponding author

Correspondence to Stefano Monni.

About this article

Cite this article

Monni, S. Bayesian variable selection for correlated covariates via colored cliques. AStA Adv Stat Anal 98, 143–163 (2014). https://doi.org/10.1007/s10182-013-0218-9

  • DOI: https://doi.org/10.1007/s10182-013-0218-9
