
Acceleration of the stochastic search variable selection via componentwise Gibbs sampling

Abstract

The stochastic search variable selection proposed by George and McCulloch (J Am Stat Assoc 88:881–889, 1993) is one of the most popular variable selection methods for linear regression models. Many efforts have been made in the literature to improve its computational efficiency. However, most of them change the original Bayesian formulation, so the resulting comparisons are not fair. This work focuses on improving the computational efficiency of the stochastic search variable selection while leaving its original Bayesian formulation unchanged. The improvement is achieved by developing a new Gibbs sampling scheme, different from that of George and McCulloch (1993). A remarkable feature of the proposed scheme is that it samples the regression coefficients from their posterior distributions in a componentwise manner, so that the expensive computation of the inverse of the information matrix, required by the algorithm of George and McCulloch (1993), is avoided. Moreover, since the original Bayesian formulation remains unchanged, the stochastic search variable selection using the proposed Gibbs sampling scheme is as efficient as that of George and McCulloch (1993) in terms of assigning large probabilities to promising models. Numerical results support these findings.


References

  • Beattie SD, Fong DKH, Lin DKJ (2002) A two-stage Bayesian model selection strategy for supersaturated designs. Technometrics 44:55–63

  • Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46:167–174

  • Chen RB, Chu CH, Lai TH, Wu YN (2011) Stochastic matching pursuit for Bayesian variable selection. Stat Comput 21:247–259

  • Chen RB, Weng JZ, Chu CH (2013) Screening procedure for supersaturated designs using a Bayesian variable selection method. Qual Reliab Eng Int 29:89–101

  • Chipman H (1998) Fast model search for designed experiments with complex aliasing. In: Quality improvement through statistical methods. Birkhäuser, Boston

  • Chipman H, Hamada M, Wu CFJ (1997) A Bayesian variable selection approach for analyzing designed experiments with complex aliasing. Technometrics 39:372–381

  • Diebolt J, Robert C (1994) Estimation of finite mixture distributions through Bayesian sampling. J R Stat Soc Ser B 56:363–375

  • Draper N, Smith H (1981) Applied regression analysis, 2nd edn. Wiley, New York

  • Fang KT, Li R, Sudjianto A (2006) Design and modeling for computer experiments. Chapman & Hall/CRC, Boca Raton

  • George EI, McCulloch RE (1993) Variable selection via Gibbs sampling. J Am Stat Assoc 88:881–889

  • George EI, McCulloch RE (1997) Approaches for Bayesian variable selection. Stat Sin 7:339–373

  • Georgiou SD (2014) Supersaturated designs: a review of their construction and analysis. J Stat Plan Inference 144:92–109

  • Geweke J (1996) Variable selection and model comparison in regression. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds) Bayesian statistics. Oxford University Press, Oxford

  • Huang HZ, Yang JY, Liu MQ (2014) Functionally induced priors for componentwise Gibbs sampler in the analysis of supersaturated designs. Comput Stat Data Anal 72:1–12

  • Li R, Lin DKJ (2003) Analysis methods for supersaturated design: some comparisons. J Data Sci 1:249–260

  • Lin DKJ (1993) A new class of supersaturated designs. Technometrics 35:28–31

  • Liu Y, Liu MQ (2011) Construction of optimal supersaturated design with large number of levels. J Stat Plan Inference 141:2035–2043

  • Liu Y, Liu MQ (2012) Construction of equidistant and weak equidistant supersaturated designs. Metrika 75:33–53

  • Liu Y, Liu MQ (2013) Construction of supersaturated design with large number of factors by the complementary design method. Acta Math Appl Sin 29:253–262

  • Phoa FKH, Pan YH, Xu H (2009) Analysis of supersaturated designs via the Dantzig selector. J Stat Plan Inference 139:2362–2372

  • Shao J (2003) Mathematical statistics, 2nd edn. Springer, New York

  • Sun FS, Lin DKJ, Liu MQ (2011) On construction of optimal mixed-level supersaturated designs. Ann Stat 39:1310–1333

  • Tanner MA, Wong WH (1987) The calculation of posterior distributions by data augmentation (with discussion). J Am Stat Assoc 82:528–550

  • Thompson MB (2010) A comparison of methods for computing autocorrelation time. Technical Report No. 1007, Department of Statistics, University of Toronto

  • Westfall PH, Young SS, Lin DKJ (1998) Forward selection error control in the analysis of supersaturated designs. Stat Sin 8:101–117

  • Wu CFJ, Hamada M (2009) Experiments: planning, analysis, and optimization, 2nd edn. Wiley, New York

  • Yin YH, Zhang QZ, Liu MQ (2013) A two-stage variable selection strategy for supersaturated designs with multiple responses. Front Math China 8:717–730

  • Zhang QZ, Zhang RC, Liu MQ (2007) A method for screening active effects in supersaturated designs. J Stat Plan Inference 137:235–248


Acknowledgements

The authors thank the Editor, Professor Norbert Henze, and two anonymous referees for their valuable comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 11271205, 11401321 and 11431006), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20130031110002), the “131” Talents Program of Tianjin and Project 613319. The first two authors contributed equally to this work.

Author information

Corresponding author

Correspondence to Min-Qian Liu.

Appendix

Proof of Theorem 1

The joint probability density function (pdf) of all the variables can be expressed as

$$\begin{aligned}{}[\varvec{\beta }, \sigma ^2,\varvec{\gamma }, {\mathbf {Y}}]= & {} [{\mathbf {Y}}|\varvec{\beta }, \sigma ^2,\varvec{\gamma }][\varvec{\beta }|\sigma ^2,\varvec{\gamma }][\sigma ^2,\varvec{\gamma }]\nonumber \\= & {} [{\mathbf {Y}}|\varvec{\beta }, \sigma ^2][\varvec{\beta }| \varvec{\gamma }][\varvec{\gamma }][\sigma ^2], \end{aligned}$$
(9)

where the last equality follows from the assumptions (1)–(4).

For \(i=1,\ldots ,p\), the full conditional pdf of the pair \((\beta _i,\gamma _i)\) can be expressed as

$$\begin{aligned}{}[\beta _i,\gamma _i|\varvec{\beta }_{(-i)},\varvec{\gamma }_{(-i)},\sigma ^2,{\mathbf {Y}}]= [\beta _i|\gamma _i,\varvec{\beta }_{(-i)},\varvec{\gamma }_{(-i)},\sigma ^2,{\mathbf {Y}}][\gamma _i|\varvec{\beta }_{(-i)},\varvec{\gamma }_{(-i)},\sigma ^2,{\mathbf {Y}}],\nonumber \\ \end{aligned}$$
(10)

where the notation \((-i)\) means all the components except the i-th one.

Next, we derive closed forms for the two conditional pdf’s on the right-hand side of (10). Notice that the prior distribution of \(\gamma _i\) is Bernoulli; by the data augmentation argument of Tanner and Wong (1987), the conditional distribution of \(\gamma _i\) given all the other variables is Bernoulli as well. Thus

$$\begin{aligned} P(\gamma _i=1|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2, {\mathbf {Y}})= & {} \frac{[{\mathbf {Y}}|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2][\gamma _i=1,\varvec{\beta }_{(-i)},\varvec{\gamma }_{(-i)}, \sigma ^2]}{[{\mathbf {Y}}|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2]} \\= & {} \frac{[{\mathbf {Y}}|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2][\gamma _i=1]}{[{\mathbf {Y}}|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2]}. \end{aligned}$$

Similarly,

$$\begin{aligned} P(\gamma _i=0|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2, {\mathbf {Y}}) = \frac{[{\mathbf {Y}}|\gamma _i=0,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2][\gamma _i=0]}{[{\mathbf {Y}}|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2]}. \end{aligned}$$

Since \( P(\gamma _i=1|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2, {\mathbf {Y}})+ P(\gamma _i=0|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2, {\mathbf {Y}})=1\), after some basic algebra we obtain that

$$\begin{aligned} P(\gamma _i=1|\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2, {\mathbf {Y}}) =\frac{z_i\pi _i}{(1-\pi _i)+z_i\pi _i}\quad \text{ for }\,\, i=1,\ldots ,p, \end{aligned}$$

where

$$\begin{aligned} z_i=\frac{\left[ {\mathbf {Y}}|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right] }{\left[ {\mathbf {Y}}|\gamma _i=0,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right] }. \end{aligned}$$
(11)

Now we derive the closed form for \(z_i\). Notice that

$$\begin{aligned} \left[ {\mathbf {Y}}|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right]= & {} \int \left[ {\mathbf {Y}},\beta _i|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right] d\beta _i \\= & {} \int \left[ {\mathbf {Y}}|\beta _i,\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right] \left[ \beta _i|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right] d\beta _i \\= & {} \int \left[ {\mathbf {Y}}|\varvec{\beta },\sigma ^2\right] \left[ \beta _i|\gamma _i=1\right] d\beta _i. \end{aligned}$$

The closed form of the integrand in the above equation can be calculated from (1) and (2). After some calculations, the integration yields

$$\begin{aligned} \left[ {\mathbf {Y}}|\gamma _i=1,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2\right] = C \exp \left\{ -\frac{{\mathbf {R}}_i^T{\mathbf {R}}_i}{2\sigma ^2}\right\} \exp \left\{ \frac{(\omega ^{-2}_i+c_i^{-2}\tau _i^{-2})^{-1}\omega ^{-4}_ib_i^2}{2}\right\} \sqrt{\frac{1}{\omega ^{-2}_ic_i^2\tau _i^2+1}}, \end{aligned}$$

where C is a normalization constant, \( \omega _i^2=\sigma ^2/({\mathbf {X}}_i^T{\mathbf {X}}_i)\), and \(b_i={\mathbf {X}}_i^T{\mathbf {R}}_i/({\mathbf {X}}_i^T{\mathbf {X}}_i)\) with \({\mathbf {R}}_i={\mathbf {Y}}-\sum _{j\ne i}\beta _j{\mathbf {X}}_j\). The closed form of \([{\mathbf {Y}}|\gamma _i=0,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)}, \sigma ^2]\) can be obtained from the right-hand side of the above equation with \(c_i\) being replaced by 1. After some calculations on (11), we have

$$\begin{aligned} z_i=\exp \left\{ \frac{\omega ^{-4}_ib_i^2\tau _i^{-2}(1-c_i^{-2})(\sigma _1^i)^2(\sigma _2^i)^2}{2} \right\} \sqrt{\frac{\omega ^{-2}_i+\tau _i^{-2}}{c_i^2\omega ^{-2}_i+\tau _i^{-2}}}, \end{aligned}$$

where \((\sigma _1^i)^2=(\omega ^{-2}_i+c_i^{-2}\tau _i^{-2})^{-1}\) and \((\sigma _2^i)^2=(\omega ^{-2}_i+\tau _i^{-2})^{-1}\). From (9), we know that \([\beta _i|\gamma _i,\varvec{\beta }_{(-i)}, \varvec{\gamma }_{(-i)},\sigma ^2,{\mathbf {Y}}] \propto [{\mathbf {Y}}|\beta _i, \varvec{\beta }_{(-i)}, \sigma ^2][\beta _i|\gamma _i]\), which can be calculated from (1) and (2). In particular,

$$\begin{aligned}{}[\beta _i|\gamma _i,\varvec{\beta }_{(-i)},\varvec{\gamma }_{(-i)},\sigma ^2,{\mathbf {Y}}] \sim \left\{ \begin{array}{ll} N\left( (\sigma _1^i)^2\omega _i^{-2}b_i, (\sigma _1^i)^2\right) &{} \text{ when } \gamma _i=1;\\ N\left( (\sigma _2^i)^2\omega _i^{-2}b_i, (\sigma _2^i)^2\right) &{} \text{ when } \gamma _i=0. \end{array} \right. \end{aligned}$$
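The conditionals derived so far suffice to update one coordinate. The following sketch (our own illustration in NumPy; the function name and argument names are not from the paper) computes \(z_i\) on the log scale for numerical stability, draws \(\gamma _i\) from its Bernoulli full conditional, and then draws \(\beta _i\) from the corresponding normal:

```python
import numpy as np

def sample_beta_gamma_i(X_i, R_i, sigma2, tau_i, c_i, pi_i, rng):
    """Draw (gamma_i, beta_i) from their joint full conditional.

    X_i : (n,) i-th column of the design matrix
    R_i : (n,) partial residual Y - sum_{j != i} beta_j X_j
    """
    xtx = X_i @ X_i
    omega2 = sigma2 / xtx                  # omega_i^2 = sigma^2 / (X_i^T X_i)
    b_i = (X_i @ R_i) / xtx                # b_i = X_i^T R_i / (X_i^T X_i)
    s1 = 1.0 / (1.0 / omega2 + 1.0 / (c_i**2 * tau_i**2))  # (sigma_1^i)^2
    s2 = 1.0 / (1.0 / omega2 + 1.0 / tau_i**2)             # (sigma_2^i)^2
    # z_i from (11), evaluated on the log scale
    log_z = 0.5 * (b_i**2 / omega2**2) * (1.0 - c_i**-2) / tau_i**2 * s1 * s2 \
        + 0.5 * np.log((1.0 / omega2 + tau_i**-2) / (c_i**2 / omega2 + tau_i**-2))
    z = np.exp(log_z)
    p1 = z * pi_i / ((1.0 - pi_i) + z * pi_i)  # P(gamma_i = 1 | rest)
    gamma_i = int(rng.random() < p1)
    s = s1 if gamma_i else s2
    beta_i = rng.normal(s * b_i / omega2, np.sqrt(s))  # N(s * omega_i^{-2} b_i, s)
    return gamma_i, beta_i
```

Note that only the inner product \({\mathbf {X}}_i^T{\mathbf {X}}_i\) is needed here, not the inverse of any \(p\times p\) matrix.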

Finally, by the same argument as in George and McCulloch (1993), we have

$$\begin{aligned}{}[\sigma ^2|\varvec{\beta }, \varvec{\gamma },{\mathbf {Y}}]\sim \mathrm {IG}\left( \frac{n+v}{2}, \frac{({\mathbf {Y}}-{\mathbf {X}}\varvec{\beta })^T({\mathbf {Y}}-{\mathbf {X}}\varvec{\beta })+v\lambda }{2}\right) , \end{aligned}$$

where \({\mathrm {IG}}\) denotes an inverted gamma distribution. Then from the well-known relation between the inverted gamma distribution and the chi-square distribution (cf., Wu and Hamada 2009), we conclude that

$$\begin{aligned} \left[ (v\lambda +({\mathbf {Y}}-{\mathbf {X}}\varvec{\beta })^T({\mathbf {Y}}-{\mathbf {X}}\varvec{\beta }))/\sigma ^2|\varvec{\beta }, \varvec{\gamma }, {\mathbf {Y}}\right] \sim \chi ^2(v+n). \end{aligned}$$

\(\square \)
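Assembled into a full sampler, the conditionals of Theorem 1 give the componentwise scheme below. This is a minimal sketch under our own conventions (variable names, and passing the hyperparameters \(\tau _i\), \(c_i\), \(\pi _i\), \(v\), \(\lambda \) as plain arrays and scalars, are assumptions, not the paper's notation): each sweep updates every \((\gamma _i,\beta _i)\) pair from its full conditional and then draws \(\sigma ^2\) via the chi-square relation above, with no \(p\times p\) matrix inversion.

```python
import numpy as np

def ssvs_componentwise(Y, X, tau, c, pi, v, lam, n_iter=2000, rng=None):
    """Componentwise Gibbs sampler for SSVS (sketch based on Theorem 1).

    Y : (n,) response; X : (n, p) design matrix
    tau, c, pi : (p,) prior hyperparameters; v, lam : scalars for the
    inverted-gamma prior on sigma^2. Returns a list of (gamma, beta, sigma2).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape
    beta = np.zeros(p)
    gamma = np.zeros(p, dtype=int)
    sigma2 = 1.0
    draws = []
    for _ in range(n_iter):
        for i in range(p):
            # partial residual R_i = Y - sum_{j != i} beta_j X_j
            R_i = Y - X @ beta + beta[i] * X[:, i]
            xtx = X[:, i] @ X[:, i]
            omega2 = sigma2 / xtx
            b_i = (X[:, i] @ R_i) / xtx
            s1 = 1.0 / (1.0 / omega2 + 1.0 / (c[i]**2 * tau[i]**2))
            s2 = 1.0 / (1.0 / omega2 + 1.0 / tau[i]**2)
            log_z = 0.5 * (b_i**2 / omega2**2) * (1.0 - c[i]**-2) / tau[i]**2 * s1 * s2 \
                + 0.5 * np.log((1.0 / omega2 + tau[i]**-2) / (c[i]**2 / omega2 + tau[i]**-2))
            z = np.exp(log_z)
            p1 = z * pi[i] / ((1.0 - pi[i]) + z * pi[i])
            gamma[i] = int(rng.random() < p1)
            s = s1 if gamma[i] else s2
            beta[i] = rng.normal(s * b_i / omega2, np.sqrt(s))
        # sigma^2 via (v*lam + SSE)/sigma^2 ~ chi^2(v + n)
        resid = Y - X @ beta
        sigma2 = (v * lam + resid @ resid) / rng.chisquare(v + n)
        draws.append((gamma.copy(), beta.copy(), sigma2))
    return draws
```

Posterior model probabilities can then be estimated from the relative frequencies of the sampled \(\varvec{\gamma }\) vectors, exactly as in the original SSVS.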


Cite this article

Huang, H., Zhou, S., Liu, MQ. et al. Acceleration of the stochastic search variable selection via componentwise Gibbs sampling. Metrika 80, 289–308 (2017). https://doi.org/10.1007/s00184-016-0604-x
