Abstract
Separable penalties for sparse vector recovery are plentiful throughout statistical methodology and theory. Here, we confine attention to the problem of estimating sparse high-dimensional normal means. Separable penalized likelihood estimators are known to have a Bayesian interpretation as posterior modes under independent product priors. Such estimators can achieve rate-minimax performance when the correct level of sparsity is known. A fully Bayes approach, on the other hand, mixes the product priors over a shared complexity parameter. These constructions can yield a self-adaptive posterior that achieves rate-minimax performance when the sparsity level is unknown. Such optimality has also been established for posterior mean functionals. However, less is known about posterior modes in these setups. Ultimately, the mixing priors render the coordinates dependent through a penalty that is no longer separable. By tying the coordinates together, the hope is to gain adaptivity and achieve automatic hyperparameter tuning. Here, we study two examples of fully Bayes penalties: the fully Bayes LASSO and the fully Bayes Spike-and-Slab LASSO of Ročková and George (The Spike-and-Slab LASSO, Submitted). We discuss discrepancies and highlight the benefits of the two-group prior variant. We develop an Appell function apparatus for coping with adaptive selection thresholds. We show that the fully Bayes treatment of a complexity parameter is tantamount to oracle hyperparameter choice for sparse normal mean estimation.
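The two-group prior behind the Spike-and-Slab LASSO can be illustrated with a short sketch. The code below is a minimal, illustrative implementation (not the authors' own software): it assumes the standard form of the prior as a mixture of two Laplace densities, a peaked "spike" with large penalty parameter `lam0` and a diffuse "slab" with small `lam1`, mixed by a weight `theta`. The function and parameter names are our own choices for exposition.

```python
import math

def laplace_pdf(beta, lam):
    """Laplace (double-exponential) density: (lam/2) * exp(-lam * |beta|)."""
    return 0.5 * lam * math.exp(-lam * abs(beta))

def ssl_prior(beta, theta, lam0, lam1):
    """Two-group spike-and-slab LASSO prior for a single coordinate:
    a mixture of a spike (lam0 large, concentrated at zero) and a
    slab (lam1 small, heavy-tailed), with mixing weight theta."""
    return theta * laplace_pdf(beta, lam1) + (1.0 - theta) * laplace_pdf(beta, lam0)

def inclusion_weight(beta, theta, lam0, lam1):
    """Conditional probability that beta was drawn from the slab component;
    this drives the adaptive shrinkage/selection behavior."""
    slab = theta * laplace_pdf(beta, lam1)
    spike = (1.0 - theta) * laplace_pdf(beta, lam0)
    return slab / (slab + spike)

def ssl_penalty(beta, theta, lam0, lam1):
    """Penalty induced by the prior, normalized so the penalty at zero is 0:
    pen(beta) = -log[ pi(beta) / pi(0) ]."""
    return -math.log(ssl_prior(beta, theta, lam0, lam1)
                     / ssl_prior(0.0, theta, lam0, lam1))
```

With `theta` held fixed, this penalty is separable across coordinates. The fully Bayes construction studied in the paper instead places a prior on the shared weight `theta` and integrates it out, which ties the coordinates together and yields the non-separable penalty whose selection thresholds the Appell function apparatus is developed to handle.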
References
Armero, C., Bayarri, M.: Prior assessments for prediction in queues. The Statistician 45, 139–153 (1994)
Bondell, H., Reich, B.: Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics 64, 115–123 (2008)
Brown, L.: Admissible estimators, recurrent diffusions, and insoluble boundary value problems. Ann. Math. Stat. 42, 855–903 (1971)
Castillo, I., Schmidt-Hieber, J., van der Vaart, A.: Bayesian linear regression with sparse priors. Ann. Stat. 43, 1986–2018 (2015)
Castillo, I., van der Vaart, A.: Needles and straw in a haystack: posterior concentration for possibly sparse sequences. Ann. Stat. 40, 2069–2101 (2012)
Donoho, D., Johnstone, I.M.: Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 425–455 (1994)
Donoho, D., Johnstone, I.M., Hoch, J.C., Stern, A.S.: Maximum entropy and the nearly black object. J. R. Stat. Soc. B 54, 41–81 (1992)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Fan, Y., Lv, J.: Asymptotic properties for combined ℓ1 and concave regularization. Biometrika 101, 67–70 (2014)
Friedman, J.: Fast sparse regression and classification. Technical Report, Department of Statistics, Stanford University (2008)
George, E.I.: Combining minimax shrinkage estimators. J. Am. Stat. Assoc. 81, 437–445 (1986a)
George, E.I.: Minimax multiple shrinkage estimation. Ann. Stat. 14, 188–205 (1986b)
Gradshteyn, I., Ryzhik, I.: Table of Integrals, Series, and Products. Academic, New York (2000)
Griffin, J.E., Brown, P.J.: Bayesian hyper-LASSOs with non-convex penalization. Aust. N. Z. J. Stat. 53, 423–442 (2012)
Ismail, M., Pitman, J.: Algebraic evaluations of some Euler integrals, duplication formulae for Appell's hypergeometric function F1, and Brownian variations. Can. J. Math. 52, 961–981 (2000)
Johnstone, I.M., Silverman, B.W.: Needles and straw in haystacks: empirical Bayes estimates of possibly sparse sequences. Ann. Stat. 32, 1594–1649 (2004)
Karp, D., Sitnik, S.M.: Inequalities and monotonicity of ratios for generalized hypergeometric function. J. Approx. Theory 161, 337–352 (2009)
Meier, L., Van de Geer, S., Bühlmann, P.: The group LASSO for logistic regression. J. R. Stat. Soc. B 70, 53–71 (2008)
Park, T., Casella, G.: The Bayesian LASSO. J. Am. Stat. Assoc. 103, 681–686 (2008)
Polson, N., Scott, J.: Shrink globally, act locally: sparse Bayesian regularization and prediction. Bayesian Stat. 9, 501–539 (2010)
Ročková, V.: Bayesian estimation of sparse signals with a continuous spike-and-slab prior. Ann. Stat. (2015, in revision)
Ročková, V., George, E.: EMVS: The EM approach to Bayesian variable selection. J. Am. Stat. Assoc. 109, 828–846 (2014)
Ročková, V., George, E.: Fast Bayesian factor analysis via automatic rotations to sparsity. J. Am. Stat. Assoc. (2015a, accepted for publication)
Ročková, V., George, E.: The Spike-and-Slab LASSO. J. Am. Stat. Assoc. (2015b, submitted)
Stein, C.: Estimation of the mean of a multivariate normal distribution. In: Hajek, J. (ed.) Prague Symposium on Asymptotic Statistics. Univerzita Karlova, Prague, Czech Republic (1974)
Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. R. Stat. Soc. B 58, 267–288 (1996)
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused LASSO. J. R. Stat. Soc. B 67, 91–108 (2005)
Wang, Z., Liu, H., Zhang, T.: Optimal computational and statistical rates of convergence for sparse nonconvex learning problems. Ann. Stat. 42, 2164–2201 (2014)
Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38, 894–942 (2010)
Zheng, Z., Fan, Y., Lv, J.: High dimensional thresholded regression and shrinkage effect. J. R. Stat. Soc. B 76, 627–649 (2014)
Zou, H.: The adaptive LASSO and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. B 67, 301–320 (2005)
Acknowledgements
This work was supported by NSF grant DMS-1406563 and AHRQ grant R21-HS021854.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ročková, V., George, E.I. (2016). Bayesian Penalty Mixing: The Case of a Non-separable Penalty. In: Frigessi, A., Bühlmann, P., Glad, I., Langaas, M., Richardson, S., Vannucci, M. (eds) Statistical Analysis for High-Dimensional Data. Abel Symposia, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-319-27099-9_11
DOI: https://doi.org/10.1007/978-3-319-27099-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27097-5
Online ISBN: 978-3-319-27099-9
eBook Packages: Mathematics and Statistics (R0)