Abstract
This paper considers sparse estimation of the regression coefficients in the linear model. Since global–local shrinkage priors do not allow the regression coefficients to be estimated as exactly zero, we propose three thresholding rules and compare their contraction properties; we also pair these rules with the popular horseshoe prior and horseshoe+ prior, two widely used global–local shrinkage priors. We derive hierarchical representations of the horseshoe prior and the horseshoe+ prior, and give the full conditional posterior distributions of all parameters needed to implement the algorithm. Simulation studies indicate that the horseshoe and horseshoe+ priors combined with the thresholding rules are both superior to spike-and-slab models. Finally, a real data analysis demonstrates the effectiveness of the proposed method for variable selection.
Acknowledgements
We would like to thank the editor and the reviewers for their valuable comments and suggestions which have greatly improved this paper.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supported by NNSF of China (11371051).
Appendix
More details on the Gibbs sampling for the horseshoe+ prior model are as follows.
From the joint posterior (2.7), it is easy to obtain the full conditional distribution \(\varvec{\beta }\mid \varvec{y},\sigma ^2,\tau ^2,\varvec{\lambda },\varvec{\eta }\sim N_p(\mu _n,\sigma ^2{\mathrm {\Lambda }_n}^{-1})\),
where \(\mu _n={\mathrm {\Lambda }_n}^{-1}X^\prime \varvec{y}\), \(\ \mathrm {\Lambda }_n=X^\prime X+\mathrm {\Lambda }_0\) and \(\mathrm {\Lambda }_0^{-1}=\tau ^2\) diag\(\{{\lambda _i}^2{\eta _i}^2\}\) with diag\(\{.\}\) being the diagonal matrix whose elements are \({\lambda _i}^2{\eta _i}^2(i=1,...,p)\).
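As a concrete illustration, the draw from this multivariate normal full conditional can be sketched in Python (a hypothetical helper: the function name, the Cholesky-based solve, and passing \(\lambda _i^2\), \(\eta _i^2\) as vectors are our assumptions, not code from the paper):

```python
import numpy as np

def draw_beta(X, y, sigma2, tau2, lam2, eta2, rng):
    """Draw beta | y ~ N(mu_n, sigma^2 * Lambda_n^{-1}) where
    Lambda_n = X'X + Lambda_0 and Lambda_0^{-1} = tau^2 diag(lam_i^2 eta_i^2).
    lam2 and eta2 hold lambda_i^2 and eta_i^2. Illustrative sketch only."""
    prior_var = tau2 * lam2 * eta2             # diagonal of Lambda_0^{-1}
    Lam_n = X.T @ X + np.diag(1.0 / prior_var)
    mu_n = np.linalg.solve(Lam_n, X.T @ y)
    # If Lam_n = L L', then mu_n + sqrt(sigma2) * L^{-T} z has
    # covariance sigma^2 * Lam_n^{-1} for z ~ N(0, I).
    L = np.linalg.cholesky(Lam_n)
    z = rng.standard_normal(len(mu_n))
    return mu_n + np.sqrt(sigma2) * np.linalg.solve(L.T, z)
```

Setting the residual variance to zero returns \(\mu _n\) itself, which is a convenient sanity check of the mean.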
Next, for the parameter \(\tau\) we have
where \(\mathrm {\Sigma }_{\varvec{\beta }}=\sigma ^2{\mathrm {\Lambda }_0}^{-1}\).
Let \(\gamma =\frac{1}{\tau ^2}\). Together with the above formula we obtain
Let \({\widetilde{\mu }}^2=\sum _{i=1}^{p}\left( \frac{\beta _i}{\lambda _i\eta _i\sigma }\right) ^2\). Using the uniform distribution, we can employ the following sampling steps to generate \(\gamma\), i.e.,
and
where \(I(\cdot )\) is the indicator function.
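The two uniform-auxiliary sampling steps for \(\gamma\) can be sketched as follows. Note that the truncated Gamma\(\left( \frac{p+1}{2},\frac{{\widetilde{\mu }}^2}{2}\right)\) form below follows the standard derivation for a half-Cauchy \(\tau\) and is our reconstruction, not an equation quoted from the paper:

```python
import numpy as np
from scipy import stats

def slice_step_gamma(gamma_cur, p, mu_tilde2, rng):
    """One uniform-auxiliary update for gamma = 1/tau^2 (tau half-Cauchy).
    mu_tilde2 is sum_i (beta_i / (lam_i * eta_i * sigma))^2."""
    # Step 1: u | gamma ~ Uniform(0, 1/(1 + gamma))
    u = rng.uniform(0.0, 1.0 / (1.0 + gamma_cur))
    ub = (1.0 - u) / u               # the indicator I(gamma < (1-u)/u)
    # Step 2: gamma | u ~ Gamma((p+1)/2, rate = mu_tilde2/2) truncated
    # to (0, ub), drawn by inverting the CDF.
    g = stats.gamma(a=(p + 1) / 2.0, scale=2.0 / mu_tilde2)
    return g.ppf(rng.uniform() * g.cdf(ub))
```

The inverse-CDF trick avoids rejection sampling: a uniform draw scaled by the CDF mass below the truncation point always lands inside the support.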
For \(\eta _i\), we have
Let \(\vartheta _i=\frac{1}{{\eta _i}^2}\), so that \(\eta _i={\vartheta _i}^{-\frac{1}{2}}\). Substituting this into the above formula, we have
Using the uniform distribution again, we obtain
and
A similar sampling method for \(\lambda _i\) is given as follows:
and
Note that
Thus, the full conditional distribution of \(\sigma ^2\) is also an inverse gamma distribution, i.e.,
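Putting the steps of this appendix together, a minimal end-to-end sketch of the Gibbs sweep might look like the following. The IG\((a_0,b_0)\) prior on \(\sigma ^2\), all variable names, and the exact truncated-Gamma forms of the slice steps are illustrative assumptions based on the standard half-Cauchy derivation, not the paper's own code:

```python
import numpy as np
from scipy import stats

def horseshoe_plus_gibbs(X, y, n_iter=1000, seed=0):
    """Sketch of a Gibbs sweep for the horseshoe+ linear model."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    a0 = b0 = 1.0                        # assumed IG(a0, b0) prior on sigma^2
    lam2, eta2 = np.ones(p), np.ones(p)  # lambda_i^2 and eta_i^2
    tau2, sigma2 = 1.0, 1.0
    XtX, Xty = X.T @ X, X.T @ y
    draws = np.empty((n_iter, p))

    def trunc_gamma(shape, rate, ub):
        # Inverse-CDF draw from Gamma(shape, rate) truncated to (0, ub).
        g = stats.gamma(a=shape, scale=1.0 / rate)
        return g.ppf(rng.uniform() * g.cdf(ub))

    for it in range(n_iter):
        # beta | . ~ N(mu_n, sigma^2 Lambda_n^{-1}),
        # Lambda_n = X'X + Lambda_0, Lambda_0^{-1} = tau^2 diag(lam2 * eta2)
        d = tau2 * lam2 * eta2
        Lam_n = XtX + np.diag(1.0 / d)
        mu_n = np.linalg.solve(Lam_n, Xty)
        L = np.linalg.cholesky(Lam_n)
        beta = mu_n + np.sqrt(sigma2) * np.linalg.solve(
            L.T, rng.standard_normal(p))

        # gamma = 1/tau^2 via the uniform auxiliary (slice) step
        mu_tilde2 = np.sum(beta**2 / (lam2 * eta2 * sigma2))
        gam = 1.0 / tau2
        u = rng.uniform(0.0, 1.0 / (1.0 + gam))
        gam = trunc_gamma((p + 1) / 2.0, mu_tilde2 / 2.0, (1.0 - u) / u)
        tau2 = 1.0 / gam

        # vartheta_i = 1/eta_i^2 (and likewise 1/lam_i^2): each reduces
        # to a truncated exponential, i.e. Gamma(1, rate) on (0, (1-u)/u)
        m = beta**2 / (2.0 * sigma2 * tau2)
        for i in range(p):
            th = 1.0 / eta2[i]
            u = rng.uniform(0.0, 1.0 / (1.0 + th))
            eta2[i] = 1.0 / trunc_gamma(1.0, m[i] / lam2[i], (1.0 - u) / u)
            th = 1.0 / lam2[i]
            u = rng.uniform(0.0, 1.0 / (1.0 + th))
            lam2[i] = 1.0 / trunc_gamma(1.0, m[i] / eta2[i], (1.0 - u) / u)

        # sigma^2 | . is inverse gamma (conjugate update)
        r = y - X @ beta
        shape = a0 + (n + p) / 2.0
        rate = b0 + 0.5 * (r @ r + np.sum(beta**2 / (tau2 * lam2 * eta2)))
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)

        draws[it] = beta
    return draws
```

The returned matrix holds one posterior draw of \(\varvec{\beta }\) per row; thresholding rules would be applied to these draws after burn-in.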
About this article
Cite this article
Yang, Y., Yang, Y. & Wang, L. Sparse estimation of linear model via Bayesian method\(^*\). Comput Stat (2024). https://doi.org/10.1007/s00180-024-01474-5