Abstract
Supernovae occur when a star’s life ends in a violent thermonuclear explosion, briefly outshining an entire galaxy before fading from view over a period of weeks or months. Because so-called Type Ia supernovae occur only in a particular physical scenario, their explosions have similar intrinsic brightnesses, which allows us to estimate their distances accurately. This in turn allows us to constrain the parameters of cosmological models that characterize the expansion history of the universe. In this paper, we show how a cosmological model can be embedded into a Gaussian hierarchical model and fit using observations of Type Ia supernovae. The overall model is an ideal testing ground for new computational methods. The Ancillarity-Sufficiency Interweaving Strategy (ASIS) and Partially Collapsed Gibbs (PCG) sampling are effective tools for improving the convergence of Gibbs samplers. Beyond using either of them alone, we can combine PCG and/or ASIS with the Metropolis-Hastings algorithm to simplify implementation and further improve convergence. We use four samplers to draw from the posterior distribution of the cosmological hierarchical model and confirm the efficiency of both PCG and ASIS. Furthermore, we find that combining two or more strategies in a single sampler yields additional efficiency gains.
Notes
1. The raw data are time-series observations of the evolving SN explosion in each of several color bands. These observations are summarized into an apparent magnitude, a stretch parameter, and a color parameter using the SALT-II method (Guy et al., 2007). The apparent magnitude is the peak magnitude in the B-band.
2. We implemented all of the samplers in Python; all chains were run on a MacBook Pro running OS X 10.8.5 with a 2.5 GHz Intel Core i5 processor.
3. To check convergence, we ran each sampler with three different overdispersed initial values. The Gelman-Rubin statistic (Gelman and Rubin, 1992; Gelman et al., 2013) suggested adequate convergence after only a few iterations for Samplers 2–4. (A burn-in of 1000 iterations was quite conservative for Samplers 2–4.) For Sampler 1, however, a burn-in of 2000 iterations was required.
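As a concrete illustration of this convergence check, here is a minimal sketch of the Gelman-Rubin statistic for several parallel chains of draws of a single scalar parameter (the function and variable names are ours, not the paper's code):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) of Gelman and Rubin (1992)
    for m parallel chains of n draws each, passed as an array of shape (m, n)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    between = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()       # within-chain variance W
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return np.sqrt(pooled / within)                  # values near 1 suggest convergence
```

With overdispersed starting points, values of R-hat well above 1 flag chains that have not yet mixed; values near 1 are consistent with convergence.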
References
Carroll, R. J., Ruppert, D., Stefanski, L. A., & Crainiceanu, C. (2006). Measurement error in nonlinear models: A modern perspective (2nd ed.). Chapman & Hall/CRC monographs on statistics & applied probability. London: Chapman & Hall/CRC.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis (3rd ed.). Chapman & Hall/CRC texts in statistical science. London: Chapman & Hall/CRC.
Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–472.
Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.
Gilks, W. R., Best, N. G., & Tan, K. K. C. (1995). Adaptive rejection Metropolis sampling within Gibbs sampling. Journal of the Royal Statistical Society, Series C (Applied Statistics), 44, 455–472.
Guy, J., Astier, P., Baumont, S., Hardin, D., Pain, R., Regnault, N., et al. (2007). SALT2: Using distant supernovae to improve the use of type Ia supernovae as distance indicators. Astronomy and Astrophysics, 466, 11–21.
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97–109.
Jiao, X. Y., & van Dyk, D. A. (2016). Combining strategies for improving the performance of Gibbs-type samplers. In preparation.
Kass, R. E., Carlin, B. P., Gelman, A., & Neal, R. M. (1998). Markov Chain Monte Carlo in practice: A roundtable discussion. The American Statistician, 52, 93–100.
Kessler, R., et al. (2009). First-year Sloan Digital Sky Survey-II supernova results: Hubble diagram and cosmological parameters. The Astrophysical Journal Supplement, 185, 32–84.
Liu, J. S. (2001). Monte Carlo strategies in scientific computing. New York: Springer.
Liu, J. S., Wong, W. H., & Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to comparisons of estimators and augmentation schemes. Biometrika, 81, 27–40.
Liu, J. S., Wong, W. H., & Kong, A. (1995). Covariance structure and convergence rate of the Gibbs sampler with various scans. Journal of the Royal Statistical Society, Series B. Statistical Methodology, 57, 157–169.
March, M. C., Trotta, R., Berkes, P., Starkman, G. D., & Vaudrevange, P. M. (2011). Improved constraints on cosmological parameters from SNIa data. Monthly Notices of the Royal Astronomical Society, 418, 2308–2329.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21, 1087–1092.
Park, T., & van Dyk, D. A. (2009). Partially collapsed Gibbs samplers: Illustrations and applications. Journal of Computational and Graphical Statistics, 18, 283–305.
Phillips, M. M. (1993). The absolute magnitudes of type Ia supernovae. The Astrophysical Journal, 413, L105–L108.
Phillips, M. M., Lira, P., Suntzeff, N. B., Schommer, R. A., Hamuy, M., & Maza, J. (1999). The reddening-free decline rate versus luminosity relationship for type Ia supernovae. The Astronomical Journal, 118, 1766–1776.
Robert, C., & Casella, G. (2004). Monte Carlo statistical methods (2nd ed.). Springer texts in statistics. New York: Springer.
Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–550.
van Dyk, D. A., & Jiao, X. Y. (2015). Metropolis-Hastings within partially collapsed Gibbs samplers. Journal of Computational and Graphical Statistics, 24, 301–327.
van Dyk, D. A., & Meng, X. L. (2001). The art of data augmentation (with discussion). Journal of Computational and Graphical Statistics, 10, 1–50.
van Dyk, D. A., & Park, T. (2008). Partially collapsed Gibbs samplers: Theory and methods. Journal of the American Statistical Association, 103, 790–796.
Yu, Y., & Meng, X. L. (2011). To center or not to center: That is not the question—an ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC efficiency (with discussion). Journal of Computational and Graphical Statistics, 20, 531–570.
Acknowledgements
David van Dyk acknowledges partial support for this work from a Wolfson Research Merit Award (WM110023) provided by the British Royal Society and from a Marie-Curie Career Integration Grant (FP7-PEOPLE-2012-CIG-321865) provided by the European Commission; Roberto Trotta acknowledges partial support from an EPSRC “Pathways to Impact” grant.
Appendix
This appendix details the sampling steps of the MH within Gibbs, MH within PCG, ASIS, and PCG within ASIS samplers, i.e., Samplers 1, 2, 3, and 4, for the cosmological hierarchical model. The posterior distribution of \((\xi, X, \varOmega_{m}, \varOmega_{\varLambda}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\) is
where \(\xi_{m} = (0, 0, M_{m})\) and \(\varSigma _{0} =\mathrm{ Diag}(\sigma _{c_{0}}^{2},\sigma _{x_{0}}^{2},\sigma _{M_{0}}^{2})\); \(\varSigma_{C}\), \(\varSigma_{P}\), \(J\), \(A\), and \(L\) are defined in Sect. 3.2. Setting \(\tilde{X} = AX + L\), the joint posterior distribution of \((\xi,\tilde{X},\varOmega _{m},\varOmega _{\varLambda },\alpha,\beta,R_{c}^{2},R_{x}^{2},\sigma _{\mathrm{res}}^{2})\) is
Furthermore, integrating out \((\xi, X)\), the marginal distribution of \((\varOmega_{m}, \varOmega_{\varLambda}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\) is
where \(\varSigma_{A}^{-1} = A^{T}\varSigma_{C}^{-1}A + \varSigma_{P}^{-1}\), \(K^{-1} = -J^{T}\varSigma_{P}^{-1}\varSigma_{A}\varSigma_{P}^{-1}J + J^{T}\varSigma_{P}^{-1}J + \varSigma_{0}^{-1}\), \(\varDelta = A^{T}\varSigma_{C}^{-1}(Y - L)\), and \(k_{0} = K(J^{T}\varSigma_{P}^{-1}\varSigma_{A}\varDelta + \varSigma_{0}^{-1}\xi_{m})\).
When MH updates are required in any of the samplers, we use truncated normal distributions as the proposal distributions. These distributions are centered at the current draws, with variance-covariance matrices adjusted to obtain an acceptance rate of around 25%. The truncation enforces the prior constraints, and in all cases the MH updates are bivariate.
When generating parameters from a truncated distribution, we repeatedly draw from the corresponding unconstrained distribution until the truncation condition is satisfied. In the cosmological example, this rejection sampling is not computationally demanding, since the ranges of the prior distributions are fairly wide.
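The repeat-until-accepted scheme just described can be sketched as follows (an illustration only; the function name and example box are ours, not the paper's code):

```python
import numpy as np

def draw_truncated(rng, mean, cov, lower, upper, max_tries=100_000):
    """Draw from N(mean, cov) restricted to the box [lower, upper] by
    repeatedly sampling the unconstrained normal until the draw satisfies
    the truncation condition (cheap when the allowed ranges are wide)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    for _ in range(max_tries):
        draw = rng.multivariate_normal(mean, cov)
        if np.all(draw >= lower) and np.all(draw <= upper):
            return draw
    raise RuntimeError("truncation region too improbable for naive rejection")
```

For example, a proposal for \((\varOmega_{m}, \varOmega_{\varLambda})\) centered at the current draw would be `draw_truncated(rng, current, prop_cov, (0, 0), (1, 2))`.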
The steps of Sampler 1 are:

1. Sample \((\xi, X)\), which consists of two sub-steps:
   - Sample \(\xi\) from \(\mathrm{N}(k_{0}, K)\);
   - Sample \(X\) from \(\mathrm{N}(\mu_{A}, \varSigma_{A})\), where \(\mu_{A} = \varSigma_{A}(\varDelta + \varSigma_{P}^{-1}J\xi)\).
2. Use MH to sample \((\varOmega_{m}, \varOmega_{\varLambda})\) from \(p(\varOmega_{m}, \varOmega_{\varLambda} \mid Y, \xi, X, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\), which is proportional to \(p(\xi, X, \varOmega_{m}, \varOmega_{\varLambda}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2} \mid Y)\), under the constraint \((\varOmega_{m}, \varOmega_{\varLambda}) \in [0, 1] \times [0, 2]\).
3. Sample \((\alpha, \beta)\) from \(\mathrm{N}(\mu_{B}, \varSigma_{B})\) with constraint \((\alpha, \beta) \in [0, 1] \times [0, 4]\), where
$$\displaystyle{ \varSigma _{B}^{-1} = \left [\begin{array}{cc} \sum \limits _{i=1}^{n} \frac{x_{i}^{2}} {\hat{\sigma }_{m_{Bi}}^{2}} & \sum \limits _{i=1}^{n}\frac{-x_{i}c_{i}} {\hat{\sigma }_{m_{Bi}}^{2}} \\ \sum \limits _{i=1}^{n}\frac{-x_{i}c_{i}} {\hat{\sigma }_{m_{Bi}}^{2}} & \sum \limits _{i=1}^{n} \frac{c_{i}^{2}} {\hat{\sigma }_{m_{Bi}}^{2}}\end{array} \right ]\mbox{ and }\mu _{B} =\varSigma _{B}\left [\begin{array}{c} \sum _{i=1}^{n}\frac{x_{i}(M_{i}-\hat{m}_{Bi}+\mu _{i})} {\hat{\sigma }_{m_{Bi}}^{2}} \\ \sum _{i=1}^{n}\frac{-c_{i}(M_{i}-\hat{m}_{Bi}+\mu _{i})} {\hat{\sigma }_{m_{Bi}}^{2}}\end{array} \right ]. }$$ (16)
4. Sample \(R_{c}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(c_{i}-c_{0})^{2}}{2}\right]\) with \(\log(R_{c}) \in [-5, 2]\).
5. Sample \(R_{x}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(x_{i}-x_{0})^{2}}{2}\right]\) with \(\log(R_{x}) \in [-5, 2]\).
6. Sample \(\sigma_{\mathrm{res}}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(M_{i}-M_{0})^{2}}{2}\right]\) with \(\log(\sigma_{\mathrm{res}}) \in [-5, 2]\).
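The MH update in step 2 can be sketched generically. We assume a diagonal proposal covariance for simplicity; because truncation makes the proposal asymmetric near the boundary, the Hastings ratio must include the forward and reverse proposal densities. Here `log_post` stands in for the (unnormalized) log conditional posterior, which we do not reproduce:

```python
import math
import numpy as np

def tnorm_draw(rng, mean, sd, lo, hi):
    """Rejection draw from N(mean, sd^2) truncated to [lo, hi]."""
    while True:
        z = rng.normal(mean, sd)
        if lo <= z <= hi:
            return z

def tnorm_logpdf(z, mean, sd, lo, hi):
    """Log density of the truncated normal at z (normalized over [lo, hi])."""
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    log_kernel = -0.5 * ((z - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))
    return log_kernel - math.log(cdf(hi) - cdf(lo))

def mh_step(rng, current, log_post, sds, box):
    """One bivariate MH update; box is ((lo1, hi1), (lo2, hi2))."""
    prop = np.array([tnorm_draw(rng, current[i], sds[i], *box[i]) for i in range(2)])
    log_fwd = sum(tnorm_logpdf(prop[i], current[i], sds[i], *box[i]) for i in range(2))
    log_rev = sum(tnorm_logpdf(current[i], prop[i], sds[i], *box[i]) for i in range(2))
    log_ratio = log_post(prop) - log_post(current) + log_rev - log_fwd
    if math.log(rng.uniform()) < log_ratio:
        return prop, True                              # accept
    return np.asarray(current, dtype=float), False     # reject
```

The proposal standard deviations `sds` would be tuned, as in the text, to achieve an acceptance rate of roughly 25%.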
The steps of Sampler 2 are:

1. Use MH to sample \((\varOmega_{m}, \varOmega_{\varLambda})\) from \(p(\varOmega_{m}, \varOmega_{\varLambda} \mid Y, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\), which is proportional to \(p(\varOmega_{m}, \varOmega_{\varLambda}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2} \mid Y)\), with \((\varOmega_{m}, \varOmega_{\varLambda}) \in [0, 1] \times [0, 2]\).
2. Use MH to sample \((\alpha, \beta)\) from \(p(\alpha, \beta \mid Y, \varOmega_{m}, \varOmega_{\varLambda}, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\), which is proportional to \(p(\varOmega_{m}, \varOmega_{\varLambda}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2} \mid Y)\), with \((\alpha, \beta) \in [0, 1] \times [0, 4]\).
3. Sample \((\xi, X)\), which consists of two sub-steps:
   - Sample \(\xi\) from \(\mathrm{N}(k_{0}, K)\);
   - Sample \(X\) from \(\mathrm{N}(\mu_{A}, \varSigma_{A})\), where \(\mu_{A} = \varSigma_{A}(\varDelta + \varSigma_{P}^{-1}J\xi)\).
4. Sample \(R_{c}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(c_{i}-c_{0})^{2}}{2}\right]\) with \(\log(R_{c}) \in [-5, 2]\).
5. Sample \(R_{x}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(x_{i}-x_{0})^{2}}{2}\right]\) with \(\log(R_{x}) \in [-5, 2]\).
6. Sample \(\sigma_{\mathrm{res}}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(M_{i}-M_{0})^{2}}{2}\right]\) with \(\log(\sigma_{\mathrm{res}}) \in [-5, 2]\).
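The final three steps, which recur in every sampler, draw a variance from an inverse-gamma conditional subject to a range constraint on the log of its square root. A minimal sketch (the helper is ours, using the fact that an Inv-Gamma(a, b) draw is b divided by a Gamma(a, 1) draw):

```python
import numpy as np

def draw_truncated_inv_gamma(rng, resid, log_lo=-5.0, log_hi=2.0, max_tries=100_000):
    """Draw V ~ Inv-Gamma(n/2, sum(resid^2)/2), keeping only draws with
    log(sqrt(V)) in [log_lo, log_hi]; e.g. V = R_c^2 with resid_i = c_i - c_0."""
    resid = np.asarray(resid, dtype=float)
    shape = 0.5 * resid.size
    rate = 0.5 * np.sum(resid ** 2)
    for _ in range(max_tries):
        v = rate / rng.gamma(shape)      # Inv-Gamma(shape, rate) draw
        if log_lo <= 0.5 * np.log(v) <= log_hi:
            return v
    raise RuntimeError("truncation region too improbable for naive rejection")
```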
The steps of Sampler 3 are:

1. Sample \((\xi, X^{\star})\), which consists of two sub-steps:
   - Sample \(\xi\) from \(\mathrm{N}(k_{0}, K)\);
   - Sample \(X^{\star}\) from \(\mathrm{N}(\mu_{A}, \varSigma_{A})\), where \(\mu_{A} = \varSigma_{A}(\varDelta + \varSigma_{P}^{-1}J\xi)\).
2. Use MH to sample \((\varOmega_{m}^{\star}, \varOmega_{\varLambda}^{\star})\) from \(p(\varOmega_{m}, \varOmega_{\varLambda} \mid Y, \xi, X^{\star}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\), which is proportional to \(p(\xi, X^{\star}, \varOmega_{m}^{\star}, \varOmega_{\varLambda}^{\star}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2} \mid Y)\), under the constraint \((\varOmega_{m}^{\star}, \varOmega_{\varLambda}^{\star}) \in [0, 1] \times [0, 2]\);
   Use \((\varOmega_{m}^{\star}, \varOmega_{\varLambda}^{\star})\) to construct \(L^{\star}\).
3. Sample \((\alpha^{\star}, \beta^{\star})\) from \(\mathrm{N}(\mu_{B}, \varSigma_{B})\) with constraint \((\alpha^{\star}, \beta^{\star}) \in [0, 1] \times [0, 4]\);
   Use \((\alpha^{\star}, \beta^{\star})\) to construct \(A^{\star}\). Then set \(\tilde{X} = A^{\star }X^{\star } + L^{\star }\).
4. Use MH to sample \((\varOmega_{m}, \varOmega_{\varLambda})\) from \(p(\varOmega _{m},\varOmega _{\varLambda }\vert Y,\xi,\tilde{X},\alpha ^{\star },\beta ^{\star },R_{c}^{2},R_{x}^{2},\sigma _{\mathrm{res}}^{2})\), which is proportional to \(p(\xi,\tilde{X},\varOmega _{m},\varOmega _{\varLambda },\alpha ^{\star },\beta ^{\star },R_{c}^{2},R_{x}^{2},\sigma _{\mathrm{res}}^{2}\vert Y )\), under the constraint \((\varOmega_{m}, \varOmega_{\varLambda}) \in [0, 1] \times [0, 2]\);
   Use \((\varOmega_{m}, \varOmega_{\varLambda})\) to construct \(L\).
5. Sample \((\alpha, \beta)\) from \(\mathrm{N}(\mu_{D}, \varSigma_{D})\) with constraint \((\alpha, \beta) \in [0, 1] \times [0, 4]\), where
$$\displaystyle{ \varSigma _{D}^{-1} = \left [\begin{array}{cc} \sum \limits _{i=1}^{n} \frac{\tilde{x}_{i}^{2}} {\sigma _{\mathrm{res}}^{2}} & \sum \limits _{i=1}^{n}\frac{-\tilde{x}_{i}\tilde{c}_{i}} {\sigma _{\mathrm{res}}^{2}} \\ \sum \limits _{i=1}^{n}\frac{-\tilde{x}_{i}\tilde{c}_{i}} {\sigma _{\mathrm{res}}^{2}} & \sum \limits _{i=1}^{n} \frac{\tilde{c}_{i}^{2}} {\sigma _{\mathrm{res}}^{2}}\end{array} \right ]\mbox{ and }\mu _{D} =\varSigma _{D}\left [\begin{array}{c} \sum _{i=1}^{n}\frac{\tilde{x}_{i}(M_{0}-\tilde{M}_{i})} {\sigma _{\mathrm{res}}^{2}} \\ \sum _{i=1}^{n}\frac{-\tilde{c}_{i}(M_{0}-\tilde{M}_{i})} {\sigma _{\mathrm{res}}^{2}}\end{array} \right ], }$$ (17)
   where \(\tilde{c}_{i}\), \(\tilde{x}_{i}\), and \(\tilde{M}_{i}\) are the \((3i-2)\)th, \((3i-1)\)th, and \((3i)\)th components of \((\tilde{X} - L)\);
   Use \((\alpha, \beta)\) to construct \(A\). Then set \(X = A^{-1}(\tilde{X} - L)\).
6. Sample \(R_{c}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(c_{i}-c_{0})^{2}}{2}\right]\) with \(\log(R_{c}) \in [-5, 2]\).
7. Sample \(R_{x}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(x_{i}-x_{0})^{2}}{2}\right]\) with \(\log(R_{x}) \in [-5, 2]\).
8. Sample \(\sigma_{\mathrm{res}}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(M_{i}-M_{0})^{2}}{2}\right]\) with \(\log(\sigma_{\mathrm{res}}) \in [-5, 2]\).
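The interweaving pattern of Sampler 3 (draw under the sufficient augmentation, update the parameter, transform to the ancillary scale, update again, transform back) can be illustrated on a deliberately simple toy model: \(y_{i} \sim \mathrm{N}(x_{i}, 1)\) with \(x_{i} \sim \mathrm{N}(\mu, V)\), \(V\) known, and a flat prior on \(\mu\). This toy model is our own stand-in for illustration, not the cosmological model:

```python
import numpy as np

def asis_sampler(y, V=1.0, n_iter=2000, seed=1):
    """Toy ASIS sampler for mu.  Sufficient augmentation: X;
    ancillary augmentation: Xtilde = X - mu (Yu and Meng, 2011)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    mu = 0.0
    draws = np.empty(n_iter)
    for t in range(n_iter):
        # 1. draw X | y, mu under the sufficient augmentation
        x = rng.normal((V * y + mu) / (V + 1.0), np.sqrt(V / (V + 1.0)))
        # 2. update mu | X
        mu = rng.normal(x.mean(), np.sqrt(V / n))
        # 3. transform to the ancillary scale
        x_tilde = x - mu
        # 4. update mu | Xtilde, y (mu now enters through the likelihood)
        mu = rng.normal((y - x_tilde).mean(), np.sqrt(1.0 / n))
        # 5. map back; X is redrawn at the next iteration's step 1
        x = x_tilde + mu
        draws[t] = mu
    return draws
```

For this toy model the interweaved chain mixes essentially as well as i.i.d. sampling, which is the behavior ASIS is designed to deliver when one augmentation mixes poorly.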
The steps of Sampler 4 are:

1. Use MH to sample \((\varOmega_{m}, \varOmega_{\varLambda})\) from \(p(\varOmega_{m}, \varOmega_{\varLambda} \mid Y, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2})\), which is proportional to \(p(\varOmega_{m}, \varOmega_{\varLambda}, \alpha, \beta, R_{c}^{2}, R_{x}^{2}, \sigma_{\mathrm{res}}^{2} \mid Y)\), with \((\varOmega_{m}, \varOmega_{\varLambda}) \in [0, 1] \times [0, 2]\);
   Use \((\varOmega_{m}, \varOmega_{\varLambda})\) to construct \(L\).
2. Sample \((\xi, X^{\star})\), which consists of two sub-steps:
   - Sample \(\xi\) from \(\mathrm{N}(k_{0}, K)\);
   - Sample \(X^{\star}\) from \(\mathrm{N}(\mu_{A}, \varSigma_{A})\), where \(\mu_{A} = \varSigma_{A}(\varDelta + \varSigma_{P}^{-1}J\xi)\).
3. Sample \((\alpha^{\star}, \beta^{\star})\) from \(\mathrm{N}(\mu_{B}, \varSigma_{B})\) with constraint \((\alpha^{\star}, \beta^{\star}) \in [0, 1] \times [0, 4]\);
   Use \((\alpha^{\star}, \beta^{\star})\) to construct \(A^{\star}\). Then set \(\tilde{X} = A^{\star }X^{\star } + L\).
4. Sample \((\alpha, \beta)\) from \(\mathrm{N}(\mu_{D}, \varSigma_{D})\) with constraint \((\alpha, \beta) \in [0, 1] \times [0, 4]\);
   Use \((\alpha, \beta)\) to construct \(A\). Then set \(X = A^{-1}(\tilde{X} - L)\).
5. Sample \(R_{c}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(c_{i}-c_{0})^{2}}{2}\right]\) with \(\log(R_{c}) \in [-5, 2]\).
6. Sample \(R_{x}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(x_{i}-x_{0})^{2}}{2}\right]\) with \(\log(R_{x}) \in [-5, 2]\).
7. Sample \(\sigma_{\mathrm{res}}^{2}\) from \(\mbox{Inv-Gamma}\left[\frac{n}{2}, \frac{\sum _{i=1}^{n}(M_{i}-M_{0})^{2}}{2}\right]\) with \(\log(\sigma_{\mathrm{res}}) \in [-5, 2]\).
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Jiao, X., van Dyk, D.A., Trotta, R., Shariff, H. (2016). The Efficiency of Next-Generation Gibbs-Type Samplers: An Illustration Using a Hierarchical Model in Cosmology. In: Jin, Z., Liu, M., Luo, X. (eds) New Developments in Statistical Modeling, Inference and Application. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-42571-9_9
DOI: https://doi.org/10.1007/978-3-319-42571-9_9
Print ISBN: 978-3-319-42570-2
Online ISBN: 978-3-319-42571-9