Global-Local Mixtures: A Unifying Framework

Abstract

Global-local mixtures, including Gaussian scale mixtures, have gained prominence in recent years, both as sparsity-inducing priors in \(p \gg n\) problems and as default priors for non-linear, many-to-one functionals of high-dimensional parameters. Here we propose a unifying framework for global-local scale mixtures using the Cauchy-Schlömilch and Liouville integral transformation identities, and use the framework to build a new Bayesian sparse signal recovery method. This new method is a Bayesian counterpart of the \(\sqrt {\text {Lasso}}\) (Belloni et al., Biometrika 98, 4, 791–806, 2011) that adapts to unknown error variance. Our framework also characterizes well-known scale mixture distributions, including the Laplace density used in the Bayesian Lasso and the densities arising in logit and quantile models, via a single integral identity. Finally, we derive a few convolutions that commonly arise in Bayesian inference and posit a conjecture concerning bridge and uniform correlation mixtures.
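As a concrete instance of the scale-mixture representations the abstract refers to, the Laplace prior of the Bayesian Lasso (Park and Casella, 2008) arises as a Gaussian scale mixture with exponential mixing on the variance (Andrews and Mallows, 1974). The sketch below checks this identity by simulation; the sample size, seed, and tolerances are illustrative choices, not from the paper.

```python
# Monte Carlo check of the Andrews-Mallows (1974) identity behind the
# Bayesian Lasso: if tau2 ~ Exponential(rate = lam^2 / 2) and
# x | tau2 ~ N(0, tau2), then marginally x ~ Laplace(scale = 1 / lam).
import numpy as np

rng = np.random.default_rng(0)
lam = 1.0
n = 200_000

# Local variances drawn from the exponential mixing density (mean 2 / lam^2).
tau2 = rng.exponential(scale=2.0 / lam**2, size=n)
# Conditionally normal draws; marginally these should be Laplace.
x = rng.normal(loc=0.0, scale=np.sqrt(tau2), size=n)

# Direct Laplace draws for comparison; Laplace(scale=b) has variance 2*b^2.
direct = rng.laplace(scale=1.0 / lam, size=n)
empirical_var = x.var()
```

With `lam = 1.0` the marginal Laplace variance is 2, and the empirical variance and quantiles of `x` match those of the direct Laplace draws up to Monte Carlo error.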

References

  • Amdeberhan, T., Glasser, M.L., Jones, M.C., Moll, V.H., Posey, R. and Varela, D. (2010). The Cauchy-Schlomilch transformation. arXiv:1004.2445 [math].

  • Andrews, D. and Mallows, C. (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society Series B: Statistical Methodology 36, 1, 99–102. https://doi.org/10.2307/2984774.

  • Baker, R. (2008). Probabilistic applications of the Schlömilch transformation. Communications in Statistics – Theory and Methods 37, 14, 2162–2176.

  • Barndorff-Nielsen, O., Kent, J. and Sørensen, M. (1982). Normal variance-mean mixtures and z distributions. International Statistical Review 50, 145–159.

  • Belloni, A., Chernozhukov, V. and Wang, L. (2011). Square-root lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98, 4, 791–806.

  • Bhadra, A., Datta, J., Li, Y., Polson, N.G. and Willard, B. (2016a). Prediction risk for the horseshoe regression. arXiv:1605.04796.

  • Bhadra, A., Datta, J., Polson, N.G. and Willard, B. (2016b). Default Bayesian analysis with global-local shrinkage priors. Biometrika 103, 4, 955–969.

  • Bhadra, A., Datta, J., Polson, N.G., Willard, B. et al. (2017). The horseshoe+ estimator of ultra-sparse signals. Bayesian Anal. 12, 4, 1105–1131.

  • Bogdan, M., Chakrabarti, A., Frommlet, F. and Ghosh, J.K. (2011). Asymptotic Bayes-optimality under sparsity of some multiple testing procedures. Ann. Stat. 39, 3, 1551–1579.

  • Boros, G., Moll, V.H. and Foncannon, J. (2006). Irresistible integrals: symbolics, analysis and experiments in the evaluation of integrals. The Mathematical Intelligencer 28, 3, 65–68.

  • Bryson, M.C. and Johnson, M.E. (1982). Constructing and simulating multivariate distributions using Khintchine’s theorem. J. Stat. Comput. Simul. 16, 2, 129–137.

  • Carvalho, C.M., Polson, N.G. and Scott, J.G. (2010). The horseshoe estimator for sparse signals. Biometrika 97, 465–480.

  • Chatterjee, A. and Lahiri, S.N. (2011). Bootstrapping lasso estimators. Journal of the American Statistical Association 106, 494, 608–625. https://doi.org/10.1198/jasa.2011.tm10159.

  • Chaubey, Y.P., Mudholkar, G.S. and Jones, M. (2010). Reciprocal symmetry, unimodality and Khintchine’s theorem. Proc. R. Soc. A 466, 2079–2096.

  • Giraud, C. (2014). Introduction to high-dimensional statistics, 138. CRC Press, Boca Raton.

  • Giraud, C., Huet, S., Verzelen, N. et al. (2012). High-dimensional regression with unknown variance. Stat. Sci. 27, 4, 500–518.

  • Gneiting, T. (1997). Normal scale mixtures and dual probability densities. J. Stat. Comput. Simul. 59, 4, 375–384.

  • Hans, C. (2011). Comment on Article by Polson and Scott. Bayesian Anal. 6, 1, 37–41.

  • Hastie, T., Tibshirani, R. and Friedman, J. (2009). The elements of statistical learning, 2. Springer, Berlin.

  • Jones, M. (2002). On Khintchine’s theorem and its place in random variate generation. Am. Stat. 56, 4, 304–307.

  • Jones, M.C. (2014). Generating distributions by transformation of scale. Statistica Sinica 24, 749–772.

  • Lévy, P. (1940). Sur certains processus stochastiques homogènes. Compositio mathematica 7, 283–339.

  • Mudholkar, G.S. and Wang, H. (2007). IG-symmetry and R-symmetry: interrelations and applications to the inverse Gaussian theory. Journal of Statistical Planning and Inference 137, 11, 3655–3671.

  • Palmer, J.A., Kreutz-Delgado, K. and Makeig, S. (2011). AMICA: An adaptive mixture of independent component analyzers with shared components. Tech. rep. Swartz Center for Computational Neuroscience, San Diego, CA.

  • Park, T. and Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association 103, 482, 681–686. http://amstat.tandfonline.com/doi/abs/10.1198/016214508000000337.

  • Pillai, N.S. and Meng, X.L. (2016). An unexpected encounter with Cauchy and Lévy. Annals of Statistics (to appear).

  • Polson, N.G. and Scott, J.G. (2010a). Large-scale simultaneous testing with hypergeometric inverted-beta priors. arXiv:1010.5223.

  • Polson, N.G. and Scott, J.G. (2010b). Shrink globally, act locally: Sparse Bayesian regularization and prediction. Bayesian Statistics 9, 501–538.

  • Polson, N.G. and Scott, J.G. (2012). On the half-Cauchy prior for a global scale parameter. Bayesian Anal. 7, 4, 887–902. doi: 10.1214/12-BA730.

  • Polson, N.G. and Scott, J.G. (2013). Data augmentation for non-Gaussian regression models using variance-mean mixtures. Biometrika 100, 2, 459–471. https://doi.org/10.1093/biomet/ass081.

  • Polson, N.G. and Scott, S.L. (2011). Data augmentation for support vector machines. Bayesian Anal. 6, 1, 1–23.

  • Polson, N.G., Scott, J.G. and Windle, J. (2013). Bayesian inference for logistic models using Pólya–Gamma latent variables. J. Am. Stat. Assoc. 108, 504, 1339–1349.

  • Polson, N.G., Scott, J.G. and Windle, J. (2014). The Bayesian bridge. Journal of the Royal Statistical Society Series B: Statistical Methodology 76, 4, 713–733. https://doi.org/10.1111/rssb.12042.

  • Polson, N.G., Scott, J.G. and Willard, B.T. (2015). Proximal algorithms in statistics and machine learning. Stat. Sci. 30, 4, 559–581.

  • Seshadri, V. (2004). Halphen’s laws. John Wiley and Sons Inc., Hoboken.

  • Sun, T. and Zhang, C.H. (2012). Scaled sparse linear regression. Biometrika 99, 4, 879–898.

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B 58, 267–288.

  • Zhang, K., Brown, L.D., George, E. and Zhao, L. (2014). Uniform correlation mixture of bivariate normal distributions and hypercubically contoured densities that are marginally normal. Am. Stat. 68, 3, 183–187.

Acknowledgements

The horseshoe prior and the framework of global-local shrinkage priors were introduced in 2010 by a series of papers (Carvalho et al. 2010; Polson and Scott, 2010a), and around the same time, the framework of the Bayes oracle for testing was introduced by Bogdan et al. (2011). The subject of Bayesian shrinkage, model selection and multiple testing almost immediately underwent an explosive development that continues to this day. Besides playing a vital role in shaping the early history and subsequent course of Bayesian theory and methodology, Professor Jayanta K. Ghosh contributed seminal theoretical results to Bayesian sparse signal recovery. We have written this paper to honor his memory.

Author information

Correspondence to Jyotishka Datta.

Additional information


Bhadra and Polson are supported by Grant No. DMS-1613063 from the US National Science Foundation.

About this article

Cite this article

Bhadra, A., Datta, J., Polson, N.G. et al. Global-Local Mixtures: A Unifying Framework. Sankhya A 82, 426–447 (2020). https://doi.org/10.1007/s13171-019-00191-2


Keywords

  • Bayes regularization
  • \(\sqrt {\text {Lasso}}\)
  • Convolution
  • Lasso
  • Logistic
  • Quantile

AMS (2000) subject classification

  • Primary 62F15
  • Secondary 62J07, 62C10