A moment-matching Ferguson & Klass algorithm

Abstract

Completely random measures (CRMs) are a key building block of a wide variety of popular stochastic models and play a pivotal role in modern Bayesian nonparametrics. The popular Ferguson & Klass representation of CRMs as a random series with decreasing jumps can immediately be turned into an algorithm for sampling realizations of CRMs, or of more elaborate models involving transformed CRMs. However, any concrete implementation requires truncating the random series at some threshold, which results in an approximation error. The goal of this paper is to quantify the quality of the approximation by a moment-matching criterion, which evaluates a measure of discrepancy between the actual moments and the moments based on the simulation output. Seen as a function of the truncation level, the methodology can be used to determine the truncation level needed to reach a given precision. The resulting moment-matching Ferguson & Klass algorithm is then implemented and illustrated on several popular Bayesian nonparametric models.
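At the heart of the algorithm, a CRM is represented as \(\tilde{\mu } = \sum _{j\ge 1} J_j \delta _{Z_j}\), where the decreasing jumps \(J_j = N^{-1}(\xi _j)\) are obtained by inverting the tail Lévy measure \(N\) at the jump times \(\xi _j\) of a unit-rate Poisson process, and the locations \(Z_j\) are i.i.d. from the base measure. The following minimal Python sketch, not taken from the paper, samples the M largest jumps for an assumed homogeneous gamma CRM with a standard-normal base measure; all function names and parameter choices are illustrative.

```python
# Minimal sketch of Ferguson & Klass sampling truncated at M jumps, assuming a
# homogeneous gamma CRM with Levy intensity nu(ds) = a s^{-1} e^{-s} ds, whose
# tail Levy measure is N(v) = a E_1(v), E_1 being the exponential integral.
import numpy as np
from scipy.special import exp1           # E_1(v) = int_v^infty s^{-1} e^{-s} ds
from scipy.optimize import brentq

def ferguson_klass_gamma(M, a=1.0, seed=None):
    """Return the M largest jumps (in decreasing order) and random locations."""
    rng = np.random.default_rng(seed)
    xi = np.cumsum(rng.exponential(size=M))     # unit-rate Poisson jump times
    # Each jump solves N(J_j) = xi_j; N is decreasing, so the jumps decrease.
    jumps = np.array([brentq(lambda v: a * exp1(v) - x, 1e-300, 100.0)
                      for x in xi])
    locations = rng.standard_normal(M)          # i.i.d. draws from base measure
    return jumps, locations

jumps, locs = ferguson_klass_gamma(M=50, a=2.0, seed=0)
# Moment-matching idea: compare the simulated total mass against its exact
# first moment, which for this gamma CRM equals a.
print(jumps.sum(), "vs exact first moment", 2.0)
```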

References

  1. Argiento, R., Bianchini, I., Guglielmi, A.: A priori truncation method for posterior sampling from homogeneous normalized completely random measure mixture models. Preprint, arXiv:1507.04528 (2015)

  2. Argiento, R., Bianchini, I., Guglielmi, A.: A blocked Gibbs sampler for NGG-mixture models via a priori truncation. Stat. Comput. 26(3), 641–661 (2016)

  3. Barrios, E., Lijoi, A., Nieto-Barajas, L.E., Prünster, I.: Modeling with normalized random measure mixture models. Stat. Sci. 28(3), 313–334 (2013)

  4. Brix, A.: Generalized gamma measures and shot-noise Cox processes. Adv. Appl. Probab. 31, 929–953 (1999)

  5. Burden, R., Faires, J.: Numerical Analysis. PWS Publishing Company, Boston (1993)

  6. Campbell, T., Huggins, J., Broderick, T., How, J.: Truncated completely random measures. In: Bayesian Nonparametrics, the Next Generation (NIPS workshop) (2015)

  7. Cont, R., Tankov, P.: Financial Modelling with Jump Processes. Chapman & Hall / CRC Press, London (2008)

  8. Daley, D.J., Vere-Jones, D.: An Introduction to the Theory of Point Processes. Vol. II. General Theory and Structure. Probability and Its Applications. Springer, New York (2008)

  9. De Blasi, P., Favaro, S., Muliere, P.: A class of neutral to the right priors induced by superposition of beta processes. J. Stat. Plan. Inference 140(6), 1563–1575 (2010)

  10. De Blasi, P., Lijoi, A., Prünster, I.: An asymptotic analysis of a class of discrete nonparametric priors. Stat. Sin. 23, 1299–1322 (2012)

  11. Doshi, F., Miller, K., Van Gael, J., Teh, Y.W.: Variational inference for the Indian buffet process. In: International Conference on Artificial Intelligence and Statistics, pp. 137–144 (2009)

  12. Epifani, I., Lijoi, A., Prünster, I.: Exponential functionals and means of neutral-to-the-right priors. Biometrika 90(4), 791–808 (2003)

  13. Ferguson, T.S.: A Bayesian analysis of some nonparametric problems. Ann. Stat. 1(2), 209–230 (1973)

  14. Ferguson, T.S.: Prior distributions on spaces of probability measures. Ann. Stat. 2(4), 615–629 (1974)

  15. Ferguson, T.S., Klass, M.J.: A representation of independent increment processes without Gaussian components. Ann. Math. Stat. 43(5), 1634–1643 (1972)

  16. Ghahramani, Z., Griffiths, T.L.: Infinite latent feature models and the Indian buffet process. In: Advances in Neural Information Processing Systems 18 (NIPS 2005), pp. 475–482 (2005)

  17. Griffin, J.E.: An adaptive truncation method for inference in Bayesian nonparametric models. Stat. Comput. 26(1–2), 423–441 (2016)

  18. Griffin, J.E., Walker, S.G.: Posterior simulation of normalized random measure mixtures. J. Comput. Graph. Stat. 20(1), 241–259 (2011)

  19. Hjort, N.L.: Nonparametric Bayes estimators based on beta processes in models for life history data. Ann. Stat. 18(3), 1259–1294 (1990)

  20. Ishwaran, H., James, L.: Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96(453), 161–173 (2001)

  21. James, L.F., Lijoi, A., Prünster, I.: Conjugacy as a distinctive feature of the Dirichlet process. Scand. J. Stat. 33(1), 105–120 (2006)

  22. James, L.F., Lijoi, A., Prünster, I.: Posterior analysis for normalized random measures with independent increments. Scand. J. Stat. 36(1), 76–97 (2009)

  23. Jordan, M.I.: Hierarchical models, nested models and completely random measures. In: Chen, M.-H., Müller, P., Sun, D., Ye, K., Dey, D. (eds.) Frontiers of Statistical Decision Making and Bayesian Analysis: in Honor of James O. Berger, pp. 207–218. Springer, New York (2010)

  24. Lijoi, A., Mena, R.H., Prünster, I.: Controlling the reinforcement in Bayesian non-parametric mixture models. J. R Stat. Soc. B 69(4), 715–740 (2007)

  25. Lijoi, A., Prünster, I.: Models beyond the Dirichlet process. In: Hjort, N.L., Holmes, C.C., Müller, P., Walker, S.G. (eds.) Bayesian Nonparametrics, pp. 80–136. Cambridge University Press, Cambridge (2010)

  26. Nieto-Barajas, L.E., Prünster, I., Walker, S.G.: Normalized random measures driven by increasing additive processes. Ann. Stat. 32(6), 2343–2360 (2004)

  27. Nieto-Barajas, L.E.: Bayesian semiparametric analysis of short- and long-term hazard ratios with covariates. Comput. Stat. Data Anal. 71, 477–490 (2014)

  28. Nieto-Barajas, L.E., Prünster, I.: A sensitivity analysis for Bayesian nonparametric density estimators. Stat. Sin. 19(2), 685 (2009)

  29. Nieto-Barajas, L.E., Walker, S.G.: Markov beta and gamma processes for modelling hazard rates. Scand. J. Stat. 29(3), 413–424 (2002)

  30. Nieto-Barajas, L.E., Walker, S.G.: Bayesian nonparametric survival analysis via Lévy driven Markov processes. Stat. Sin. 14(4), 1127–1146 (2004)

  31. Orbanz, P., Williamson, S.: Unit-rate Poisson representations of completely random measures. Technical report (2012)

  32. Paisley, J.W., Blei, D.M., Jordan, M.I.: Stick-breaking beta processes and the Poisson process. In: International Conference on Artificial Intelligence and Statistics, pp. 850–858 (2012)

  33. Regazzini, E., Lijoi, A., Prünster, I.: Distributional results for means of normalized random measures with independent increments. Ann. Stat. 31(2), 560–585 (2003)

  34. Rosiński, J.: Series representations of Lévy processes from the perspective of point processes. In: Barndorff-Nielsen, O., Mikosch, T., Resnick, S.I. (eds.) Lévy Processes, pp. 401–415. Birkhäuser Boston, Boston, MA (2001)

  35. Teh, Y.W., Görür, D.: Indian buffet processes with power-law behavior. In: Advances in Neural Information Processing Systems 22 (NIPS 2009), pp. 1838–1846 (2009)

  36. Thibaux, R., Jordan, M.I.: Hierarchical beta processes and the Indian buffet process. In: International Conference on Artificial Intelligence and Statistics, pp. 564–571 (2007)

  37. Walker, S.G., Damien, P.: Representations of Lévy processes without Gaussian components. Biometrika 87(2), 477–483 (2000)


Acknowledgments

The authors are grateful to an Associate Editor and two anonymous Referees for valuable comments and suggestions.

Author information

Correspondence to Julyan Arbel.

Additional information

Research supported by the European Research Council (ERC) through StG “N-BNP” 306406. Available on arXiv:1606.02566.

Appendices

Appendix 1: Proof of Proposition 1

For any measurable set A of \(\mathbb {X}\), the n-th moment of \(\tilde{\mu }(A)\), if it exists, is given by \(m_n(A) = (-1)^nL_A^{(n)}(0)\), where \(L_A^{(n)}(0)\) denotes the n-th derivative of the Laplace transform \(L_A\) in (2) evaluated at 0. The result is proved by applying Faà di Bruno's formula to (2) in order to obtain these derivatives.\(\square \)
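To illustrate, consider the homogeneous case in which the Lévy intensity factorizes as \(\nu (\mathrm {d}s,\mathrm {d}x)=\rho (s)\,\mathrm {d}s\,\lambda (\mathrm {d}x)\), so that \(L_A(u)=\exp \{-\lambda (A)\,\psi (u)\}\) with \(\psi (u)=\int _0^\infty (1-\mathrm {e}^{-us})\rho (s)\,\mathrm {d}s\); this notation is introduced here only for the example. For \(n=1,2\), Faà di Bruno's formula reduces to

$$\begin{aligned} m_1(A)&= -L_A'(0) = \lambda (A)\int _0^\infty s\,\rho (s)\,\mathrm {d}s,\\ m_2(A)&= L_A''(0) = m_1(A)^2 + \lambda (A)\int _0^\infty s^2\rho (s)\,\mathrm {d}s, \end{aligned}$$

recovering in particular that the variance of \(\tilde{\mu }(A)\) equals \(\lambda (A)\int _0^\infty s^2\rho (s)\,\mathrm {d}s\).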

Fig. 6 Stable-beta process with parameters \(\sigma =0\) and \(\sigma =0.5\). Left: bound in probability \(\tilde{t}_M^\epsilon \) of the tail sum \(T_M\), obtained by direct calculation of the quantiles \(q_j\) with \(\epsilon =10^{-2}\), as the truncation level M increases. Right: bounds \(t_M^\epsilon \) (provided in Proposition 4) and \(\tilde{t}_M^\epsilon \) (obtained by direct calculation of the quantiles \(q_j\)) of the tail sum after M jumps, with \(\epsilon =10^{-2}\)

Appendix 2: Evaluation of the tail sum of the stable-beta process

Here we provide an evaluation of the tail sum (11) in the case of the stable-beta process. We start by stating a lemma useful for upper bounding the tail sum.

Lemma 1

Let the function \(N(\,\cdot \,)\) be as in (9) for the stable-beta process. Then, for any \(\xi >0\),

$$\begin{aligned} N^{-1}(\xi ) \le \left\{ \begin{array}{ll} \mathrm {e}^{\frac{1-\xi /a}{c}} &{}\quad \text {if } \sigma =0,\\ (\alpha \xi +\beta )^{-1/\sigma } &{}\quad \text {if } \sigma \in (0,1), \end{array} \right. \end{aligned}$$

where \(\alpha = \sigma \varGamma (1-\sigma )\frac{\varGamma (c+\sigma )}{a\varGamma (c+1)}\) and \(\beta = 1-\frac{\sigma }{c+\sigma }\varGamma (1-\sigma )\).

Proof

For \(\sigma =0\), from \(u^{-1}(1-u)^{c-1}\le u^{-1}+(1-u)^{c-1}\) one obtains \(\int _{v}^{1} u^{-1}(1-u)^{c-1}\mathrm {d}u\le 1/c -\log v\). Hence, \(N(v)/a\le 1-c\log v\) and \(N^{-1}(\xi )\le \mathrm {e}^{(1-\xi /a)/c}\). The argument for \(\sigma \ne 0\) follows along the same lines starting from \(u^{-1-\sigma }(1-u)^{\sigma +c-1}\le \varGamma (1-\sigma )u^{-\sigma -1}+(1-u)^{\sigma +c-1}\).\(\square \)
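As a numerical sanity check of the lemma in the \(\sigma =0\) case, the following Python sketch inverts \(N\) exactly and compares with the bound; it assumes, consistently with the proof, that (9) takes the beta-process form \(N(v)=ac\int _v^1 u^{-1}(1-u)^{c-1}\,\mathrm {d}u\), and the parameter values are illustrative.

```python
# Numerical check of Lemma 1 for sigma = 0, with the assumed beta-process form
# N(v) = a*c * int_v^1 u^{-1} (1-u)^{c-1} du and illustrative parameters.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

a, c = 1.0, 2.0

def N(v):
    # Substituting u = exp(-t) removes the 1/u factor and tames the integrand.
    val, _ = quad(lambda t: (1.0 - np.exp(-t)) ** (c - 1.0), 0.0, -np.log(v))
    return a * c * val

for xi in [0.5, 1.0, 5.0, 10.0]:
    exact = brentq(lambda v: N(v) - xi, 1e-8, 1.0 - 1e-12)  # exact N^{-1}(xi)
    bound = np.exp((1.0 - xi / a) / c)                      # Lemma 1 bound
    assert exact <= bound
    print(f"xi = {xi:5.2f}   N^-1(xi) = {exact:.5f}   bound = {bound:.5f}")
```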

Proposition 4

Let \((\xi _j)_{j\ge 1}\) be the jump times of a homogeneous Poisson process on \(\mathbb {R}^+\) with unit intensity, and define the tail sum of the stable-beta process as

$$\begin{aligned} T_M = \sum _{j=M+1}^\infty N^{-1}(\xi _j), \end{aligned}$$

where \(N(\,\cdot \,)\) is given by (9). Then for any \(\epsilon \in (0,1)\),

$$\begin{aligned} \mathbb {P}\Big (T_M\le t_M^\epsilon \Big ) \ge 1-\epsilon , \quad \text {for } t_M^\epsilon = \left\{ \begin{array}{ll} \frac{C_1}{\epsilon }\mathrm {e}^{\frac{1}{c}-\frac{\epsilon M}{C_1}} &{} \quad \text {if } \sigma =0,\\ \frac{\sigma }{1-\sigma }\frac{(C_2/\epsilon )^{1/\sigma }}{(M+\beta C_2/\epsilon )^{1/\sigma -1}} &{} \quad \text {if } \sigma \in (0,1), \end{array} \right. \end{aligned}$$

where \(C_1=2ac\mathrm {e}\) and \(C_2=2\mathrm {e}/\alpha \) do not depend on \(\epsilon \).

Proof

The proof follows along the same lines as that of Theorem A.1 in Brix (1999). Let \(q_j\) denote the \(\epsilon 2^{M-j}\) quantile, for \(j=M+1,M+2,\ldots \), of a gamma distribution with mean and variance equal to j. Then

$$\begin{aligned} \mathbb {P}\bigg (\sum _{j=M+1}^\infty N^{-1}(\xi _j)\le \sum _{j=M+1}^\infty N^{-1}(q_j)\bigg )\ge 1-\epsilon . \end{aligned}$$

An upper bound on \(\tilde{t}_M^\epsilon =\sum _{j=M+1}^\infty N^{-1}(q_j)\) is then found by resorting to Lemma 1 along with the inequality \(q_j\ge \frac{\epsilon }{2\mathrm {e}}j\). If \(\sigma =0\)

$$\begin{aligned} \tilde{t}_M^\epsilon&\le \mathrm {e}^{1/c} \sum _{j=M+1}^\infty \mathrm {e}^{-\frac{q_j}{ac}} \le \mathrm {e}^{1/c} \sum _{j=M+1}^\infty \mathrm {e}^{-\frac{\epsilon j}{2ac\mathrm {e}}} \\&\le \mathrm {e}^{1/c}\, \frac{2ac\mathrm {e}}{\epsilon }\,\mathrm {e}^{-\frac{\epsilon M}{2ac\mathrm {e}}}, \end{aligned}$$

whereas if \(\sigma \ne 0\)

$$\begin{aligned} \tilde{t}_M^\epsilon&\le \sum _{j=M+1}^\infty (\alpha q_j+\beta )^{-\frac{1}{\sigma }} \le \sum _{j=M+1}^\infty \left( \frac{\alpha \epsilon j}{2\mathrm {e}}+\beta \right) ^{-\frac{1}{\sigma }} \\&= \left( \frac{2\mathrm {e}}{\alpha \epsilon }\right) ^{\frac{1}{\sigma }}\sum _{j=M+1}^\infty \left( j+\frac{2\mathrm {e}\beta }{\alpha \epsilon }\right) ^{-\frac{1}{\sigma }}. \end{aligned}$$

The result follows by bounding the last sum by \(\int _{M}^{\infty } \left( x+\frac{2\mathrm {e}\beta }{\alpha \epsilon }\right) ^{-\frac{1}{\sigma }}\mathrm {d}x\).\(\square \)
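Both bounds are straightforward to compute numerically. The sketch below, for \(\sigma =0\) and with the same assumed beta-process form of \(N\) as in the check of Lemma 1 (parameter values and the cutoff of the infinite sum are again illustrative), evaluates \(\tilde{t}_M^\epsilon \) from the quantiles \(q_j\) alongside the closed-form \(t_M^\epsilon \) of Proposition 4.

```python
# Sketch comparing the direct bound t~_M^eps (sum of N^{-1}(q_j) over the
# quantiles q_j) with the conservative closed form t_M^eps, for sigma = 0.
# Assumes N(v) = a*c * int_v^1 u^{-1}(1-u)^{c-1} du; parameters illustrative.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import gamma

a, c, eps = 1.0, 2.0, 1e-2

def N(v):
    # Same log-substituted integral as in the Lemma 1 check above.
    val, _ = quad(lambda t: (1.0 - np.exp(-t)) ** (c - 1.0), 0.0, -np.log(v))
    return a * c * val

def N_inv(xi):
    lo = 1e-12
    if N(lo) <= xi:       # jump below numerical resolution: negligible term
        return 0.0
    return brentq(lambda v: N(v) - xi, lo, 1.0 - 1e-12)

def t_tilde(M, J=100):
    # q_j is the eps * 2^(M-j) quantile of a Gamma(j, 1) law (mean = var = j);
    # the infinite tail sum is cut at j = M + J, beyond which terms vanish.
    js = np.arange(M + 1, M + J + 1)
    q = gamma.ppf(eps * 2.0 ** (M - js), js)
    return sum(N_inv(x) for x in q)

def t_closed(M):
    C1 = 2.0 * a * c * np.e                 # sigma = 0 case of Proposition 4
    return (C1 / eps) * np.exp(1.0 / c - eps * M / C1)

for M in [10, 20, 50]:
    print(f"M = {M:3d}   t~ = {t_tilde(M):.2e}   t = {t_closed(M):.2e}")
```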

The bound \(t_M^\epsilon \) obtained in Proposition 4 is exponential when \(\sigma =0\) and polynomial when \(\sigma \ne 0\), but it is very conservative, as already pointed out by Brix (1999). This is further highlighted in the table associated with Fig. 6, where the bound \(t_M^\epsilon \) is computed with the appropriate constants derived from the proof. In contrast, the bound \(\tilde{t}_M^\epsilon \), obtained by direct calculation of the quantiles \(q_j\) (instead of resorting to a lower bound on them), is much sharper. Figure 6 displays this sharper bound \(\tilde{t}_M^\epsilon \). Its decay as M increases is reminiscent of that of the indices \(\ell _M\) and \(e_M\) studied in the paper, a further indication that the Ferguson & Klass algorithm is a tool with well-behaved approximation error.


Cite this article

Arbel, J., Prünster, I. A moment-matching Ferguson & Klass algorithm. Stat Comput 27, 3–17 (2017). https://doi.org/10.1007/s11222-016-9676-8

Keywords

  • Bayesian Nonparametrics
  • Completely random measures
  • Ferguson & Klass algorithm
  • Moment-matching
  • Normalized random measures
  • Posterior sampling