## Abstract

Bayesian methods provide a natural means for uncertainty quantification: credible sets can be easily obtained from the posterior distribution. But is this uncertainty quantification valid in the sense that the posterior credible sets attain the nominal frequentist coverage probability? This paper investigates the frequentist validity of posterior uncertainty quantification based on a class of empirical priors in the sparse normal mean model. In particular, we show that our marginal posterior credible intervals achieve the nominal frequentist coverage probability under conditions slightly weaker than those needed for selection consistency and a Bernstein–von Mises theorem for the full posterior. Numerical investigations suggest that our empirical Bayes method has superior frequentist coverage properties compared to other fully Bayes methods.

## Notes

In fact, it is not uncommon in seminar talks or less formal discussions to hear one motivate the construction of a new prior by saying that existing priors “don’t work” and/or the new prior “works better,” not that it more accurately reflects subjective prior beliefs, etc.

The expression for *π*^{n}(*S*) given in Section 4.1 of Martin et al. (2017) has a typo, but the correct formula is given in the supplement at https://arxiv.org/abs/1406.7718.

## References

Arias-Castro, E. and Lounici, K. (2014). Estimation and variable selection with exponential weights. *Electron. J. Stat.* **8**, 1, 328–354.

Barbieri, M. M. and Berger, J. O. (2004). Optimal predictive model selection. *Ann. Statist.* **32**, 3, 870–897.

Belitser, E. (2017). On coverage and local radial rates of credible sets. *Ann. Statist.* **45**, 3, 1124–1151.

Belitser, E. and Ghosal, S. (2019). Empirical Bayes oracle uncertainty quantification. *Ann. Statist.*, to appear. http://www4.stat.ncsu.edu/ghoshal/papers/oracle_regression.pdf.

Belitser, E. and Nurushev, N. (2017). Needles and straw in a haystack: robust confidence for possibly sparse sequences. Unpublished manuscript. arXiv:1511.01803.

Bogdan, M., Chakrabarti, A., Frommlet, F. and Ghosh, J. K. (2011). Asymptotic Bayes-optimality under sparsity of some multiple testing procedures. *Ann. Statist.* **39**, 3, 1551–1579.

Bogdan, M., Ghosh, J. K. and Tokdar, S. T. (2008). *A comparison of the Benjamini-Hochberg procedure with some Bayesian rules for multiple testing*. IMS, Beachwood, Balakrishnan, N., Peña, E. and Silvapulle, M. (eds.).

Carvalho, C. M., Polson, N. G. and Scott, J. G. (2010). The horseshoe estimator for sparse signals. *Biometrika* **97**, 2, 465–480.

Castillo, I., Schmidt-Hieber, J. and van der Vaart, A. (2015). Bayesian linear regression with sparse priors. *Ann. Statist.* **43**, 5, 1986–2018.

Castillo, I. and Szabó, B. (2019). Spike and slab empirical Bayes sparse credible sets. Unpublished manuscript. arXiv:1808.07721.

Castillo, I. and van der Vaart, A. (2012). Needles and straw in a haystack: posterior concentration for possibly sparse sequences. *Ann. Statist.* **40**, 4, 2069–2101.

Datta, J. and Ghosh, J. K. (2013). Asymptotic properties of Bayes risk for the horseshoe prior. *Bayesian Anal.* **8**, 1, 111–131.

Donoho, D. L. and Johnstone, I. M. (1994). Minimax risk over *l*_{p}-balls for *l*_{q}-error. *Probab. Theory Related Fields* **99**, 2, 277–303.

Ghosh, J. K., Delampady, M. and Samanta, T. (2006). *An introduction to Bayesian analysis*. Springer, New York.

Ghosh, P. and Chakrabarti, A. (2015). Posterior concentration properties of a general class of shrinkage estimators around nearly black vectors. Unpublished manuscript. arXiv:1412.8161.

Grünwald, P. and Mehta, N. (2017). Faster rates for general unbounded loss functions: from ERM to generalized Bayes. Unpublished manuscript. arXiv:1605.00252.

Grünwald, P. and van Ommen, T. (2017). Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it. *Bayesian Anal.* **12**, 4, 1069–1103.

Holmes, C. C. and Walker, S. G. (2017). Assigning a value to a power likelihood in a general Bayesian model. *Biometrika* **104**, 2, 497–503.

Jiang, W. and Zhang, C.-H. (2009). General maximum likelihood empirical Bayes estimation of normal means. *Ann. Statist.* **37**, 4, 1647–1684.

Johnstone, I. M. and Silverman, B. W. (2004). Needles and straw in haystacks: empirical Bayes estimates of possibly sparse sequences. *Ann. Statist.* **32**, 4, 1594–1649.

Li, K.-C. (1989). Honest confidence regions for nonparametric regression. *Ann. Statist.* **17**, 3, 1001–1008.

Liu, C., Yang, Y., Bondell, H. and Martin, R. (2018). Bayesian inference in high-dimensional linear models using an empirical correlation-adaptive prior. Unpublished manuscript. arXiv:1810.00739.

Martin, R. (2017). Invited comment on the article by van der Pas, Szabó, and van der Vaart. *Bayesian Anal.* **12**, 4, 1254–1258.

Martin, R. (2018). Empirical priors and posterior concentration rates for a monotone density. *Sankhya A*, to appear. arXiv:1706.08567.

Martin, R., Mess, R. and Walker, S. G. (2017). Empirical Bayes posterior concentration in sparse high-dimensional linear models. *Bernoulli* **23**, 3, 1822–1847.

Martin, R. and Tang, Y. (2019). Empirical priors for prediction in sparse high-dimensional linear regression. arXiv:1903.00961.

Martin, R. and Walker, S. G. (2014). Asymptotically minimax empirical Bayes estimation of a sparse normal mean vector. *Electron. J. Stat.* **8**, 2, 2188–2206.

Martin, R. and Walker, S. G. (2019). Data-dependent priors and their posterior concentration rates. *Electron. J. Stat.* **13**, 2, 3049–3081.

Ning, B. and Ghosal, S. (2018). Bayesian linear regression for multivariate responses under group sparsity. Unpublished manuscript. arXiv:1807.03439.

Nurushev, N. and Belitser, E. (2019). General framework for projection structures. Unpublished manuscript. arXiv:1904.01003.

Salomond, J.-B. (2014). Concentration rate and consistency of the posterior distribution for selected priors under monotonicity constraints. *Electron. J. Stat.* **8**, 1, 1380–1404.

Syring, N. and Martin, R. (2019). Calibrating general posterior credible regions. *Biometrika* **106**, 2, 479–486.

Szabó, B., van der Vaart, A. W. and van Zanten, J. H. (2015). Frequentist coverage of adaptive nonparametric Bayesian credible sets. *Ann. Statist.* **43**, 4, 1391–1428.

van der Pas, S., Scott, J., Chakraborty, A. and Bhattacharya, A. (2016). horseshoe: Implementation of the Horseshoe Prior. R package version 0.1.0.

van der Pas, S., Szabó, B. and van der Vaart, A. (2017a). Adaptive posterior contraction rates for the horseshoe. *Electron. J. Stat.* **11**, 2, 3196–3225.

van der Pas, S., Szabó, B. and van der Vaart, A. (2017b). Uncertainty quantification for the horseshoe (with discussion). *Bayesian Anal.* **12**, 4, 1221–1274. With a rejoinder by the authors.

van der Pas, S. L., Kleijn, B. J. K. and van der Vaart, A. W. (2014). The horseshoe estimator: posterior concentration around nearly black vectors. *Electron. J. Stat.* **8**, 2, 2585–2618.

## Acknowledgments

The authors thank the editors of the special issue of *Sankhya A* dedicated to Jayanta K. Ghosh for the invitation to contribute, and the anonymous reviewers for their helpful suggestions that improved both our results and presentation. This work is partially supported by the National Science Foundation, DMS–1737933.

## Appendix: Proof of Theorem 2

The proof strategy here closely follows that of Theorem 5′ in the supplement to Martin et al. (2017), presented in the most recent arXiv version (arXiv:1406.7718). To fix notation, let *S*^{⋆} be the true configuration of size *s*^{⋆} = |*S*^{⋆}|, and let \(S^{\dagger} \subseteq S^{\star}\) be the set of all *i* such that \(|\theta_{i}^{\star}| \geq \rho_{n}\), where *ρ*_{n} is as in Eq. 2.6, and write *s*^{†} = |*S*^{†}|.

Based on Theorem 2 in Martin et al. (2017), we can restrict to configurations *S* such that |*S*| ≤ *C* *s*^{⋆}, where *C* is a large constant. Take such an *S* that also satisfies \(S \not\supseteq S^{\dagger}\). Then *π*^{n}(*S*) can be bounded as follows:

where *z* = (1 + *α* *τ*^{−1})^{−1/2} < 1. A key observation is that

and the latter two terms are independent since they depend on disjoint sets of *Y*_{i}’s. Therefore, using this independence and the familiar central and non-central chi-square moment generating functions, we get

By definition of *S*^{†}, and the fact that 1 + *α* > 1, the above expectation can be upper-bounded by

Putting the pieces together we have

We want to sum this over all \(S \not\supseteq S^{\dagger}\) but, since it only involves the size of *S*, we only need to sum over sizes. Indeed, after plugging in the definition of *π*(*S*) we get

For the binomial coefficient ratio we have the following simplification and bound:

Next, to bound the double-sum, split it into two parts:

We need to show that both parts on the right-hand side above vanish as \(n \to \infty \). For the first double-sum we have

Since *M* > 1 + *a*_{1}, the inner sum is *O*(1) and the outer sum, because there is a common *n*^{−(*M* − *a*_{1} − 1)} factor, is *o*(1) as \(n \to \infty\). Similarly, for the second double-sum we have

The inner sum is *O*(1) and, since *a*_{2} < *a*_{1}, the outer sum is upper-bounded by *O*(*s*^{⋆} *n*^{−*a*_{2}}), which goes to 0 by assumption. Both parts of the double-sum above vanish as \(n \to \infty\), thus proving the claim.
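The independence step in this proof rests on two standard facts that can be checked numerically. First, a difference of squared norms over two complements depends only on the coordinates where the two configurations disagree, which form disjoint index sets. Second, central and non-central chi-square moment generating functions have closed forms. The following is a minimal sketch of both, not the authors' code: it assumes unit error variance, and the function names, the configurations, the mean vector, and the value of *t* are hypothetical choices made only for illustration.

```python
import numpy as np


def norm_sq_difference(y, S, S_star):
    """Compute ||y_{S^c}||^2 - ||y_{S_star^c}||^2 directly from the complements."""
    in_S = np.zeros(len(y), dtype=bool)
    in_S[np.asarray(S, dtype=int)] = True
    in_S_star = np.zeros(len(y), dtype=bool)
    in_S_star[np.asarray(S_star, dtype=int)] = True
    return np.sum(y[~in_S] ** 2) - np.sum(y[~in_S_star] ** 2)


def disjoint_form(y, S, S_star):
    """Same quantity, using only the coordinates where S and S_star disagree."""
    missed = np.asarray(sorted(set(S_star) - set(S)), dtype=int)  # in S_star, not in S
    extra = np.asarray(sorted(set(S) - set(S_star)), dtype=int)   # in S, not in S_star
    return np.sum(y[missed] ** 2) - np.sum(y[extra] ** 2)


def chi2_mgf(t, k, nc=0.0):
    """E[exp(t X)] for X chi-square with k degrees of freedom and
    noncentrality nc (sum of squared means); finite only for t < 1/2."""
    if t >= 0.5:
        raise ValueError("the chi-square MGF is finite only for t < 1/2")
    return (1.0 - 2.0 * t) ** (-k / 2.0) * np.exp(nc * t / (1.0 - 2.0 * t))


def monte_carlo_mgf(t, theta, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E[exp(t ||Y||^2)] with Y ~ N(theta, I)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    y = theta + rng.standard_normal((n_draws, theta.size))
    return float(np.mean(np.exp(t * np.sum(y ** 2, axis=1))))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.standard_normal(10)
    S, S_star = [0, 1, 2, 5], [1, 2, 3, 4]  # hypothetical configurations
    print(norm_sq_difference(y, S, S_star), disjoint_form(y, S, S_star))

    alpha = 0.5                  # hypothetical likelihood power in (0, 1)
    theta = [1.0, 1.0, 1.0]      # hypothetical means on the missed coordinates
    exact = chi2_mgf(-alpha / 2, k=len(theta), nc=float(np.sum(np.square(theta))))
    print(exact, monte_carlo_mgf(-alpha / 2, theta))
```

With *t* = −*α*/2 the closed form carries a factor (1 + *α*)^{−k/2} and an exponent proportional to −*α*/(1 + *α*) times the squared signal size, which is where a condition such as 1 + *α* > 1 naturally enters. The Monte Carlo check is restricted to negative *t* because the integrand is then bounded and the estimate is stable.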

## About this article

### Cite this article

Martin, R., Ning, B. Empirical Priors and Coverage of Posterior Credible Sets in a Sparse Normal Mean Model. *Sankhya A* **82**, 477–498 (2020). https://doi.org/10.1007/s13171-019-00189-w

### Keywords and phrases.

- Bayesian inference
- Bernstein–von Mises theorem
- Concentration rate
- High-dimensional model
- Uncertainty quantification

### AMS (2000) subject classification.

- Primary 62C12, 62F12
- Secondary 62E20