Abstract
We define a new class of random probability measures that approximates the well-known normalized generalized gamma (NGG) process. The new process is defined from the representation of NGG processes as discrete measures, where the weights are obtained by normalizing the jumps of a Poisson process and the support consists of independent identically distributed location points, but it retains only the jumps larger than a threshold \(\varepsilon \). Therefore, the number of jumps of the new process, called the \(\varepsilon \)-NGG process, is a.s. finite. A prior distribution for \(\varepsilon \) can be elicited. We take this process as the mixing measure in a mixture model for density and cluster estimation, and build an efficient Gibbs sampler to simulate from the posterior. Finally, we discuss the application and performance of the model on two popular datasets, as well as comparisons with competing algorithms, the slice sampler and a posteriori truncation.
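To fix ideas, the construction described above can be sketched in a few lines of code. The sketch below is ours, not the authors' implementation: it assumes the generalized gamma jump intensity \(\rho (s)=\kappa /\varGamma (1-\sigma )\, s^{-1-\sigma }{\mathrm{e}}^{-\omega s}\), so that \(\varLambda _\varepsilon = \kappa \omega ^\sigma \varGamma (-\sigma ,\omega \varepsilon )/\varGamma (1-\sigma )\), and the convention (see Appendix 4) that, given \(N_\varepsilon \), the \(N_\varepsilon +1\) jumps \(J_0,\ldots ,J_{N_\varepsilon }\) are iid from \(\rho _\varepsilon /\varLambda _\varepsilon \); \(\varGamma (-\sigma ,x)\) is computed via the recurrence quoted in Appendix 3, since SciPy's incomplete gamma routines require a positive shape. All helper names are hypothetical.

```python
import numpy as np
from scipy.special import gamma as gamma_fn, gammaincc

def upper_inc_gamma_neg(sigma, x):
    # Gamma(-sigma, x) for 0 < sigma < 1, via the recurrence
    # Gamma(1-sigma, x) = -sigma * Gamma(-sigma, x) + x**(-sigma) * exp(-x).
    g1 = gamma_fn(1.0 - sigma) * gammaincc(1.0 - sigma, x)  # Gamma(1-sigma, x)
    return (x ** (-sigma) * np.exp(-x) - g1) / sigma

def sample_eps_ngg(sigma, kappa, omega, eps, rng):
    """Draw one realization of an eps-NGG process with a N(0,1) base measure P0.

    Returns (normalized weights, iid locations). Illustrative sketch only."""
    # Total mass of the jump intensity above eps:
    # Lambda_eps = kappa * omega**sigma * Gamma(-sigma, omega*eps) / Gamma(1-sigma)
    lam = (kappa * omega ** sigma
           * upper_inc_gamma_neg(sigma, omega * eps) / gamma_fn(1.0 - sigma))
    n_jumps = rng.poisson(lam) + 1  # N_eps + 1 jumps, including the extra J_0
    jumps = np.empty(n_jumps)
    for i in range(n_jumps):
        # Rejection sampling from the density prop. to s**(-1-sigma) * exp(-omega*s)
        # on (eps, inf): propose s = eps + Exp(omega), accept w.p. (s/eps)**(-1-sigma).
        while True:
            s = eps + rng.exponential(1.0 / omega)
            if rng.uniform() < (s / eps) ** (-1.0 - sigma):
                jumps[i] = s
                break
    weights = jumps / jumps.sum()
    locations = rng.standard_normal(n_jumps)  # iid from P0 = N(0, 1)
    return weights, locations
```

The rejection step is valid because \(s^{-1-\sigma }\le \varepsilon ^{-1-\sigma }\) on \((\varepsilon ,+\infty )\), so the shifted exponential dominates the target up to a constant.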
References
Argiento, R., Guglielmi, A., Pievatolo, A.: Bayesian density estimation and model selection using nonparametric hierarchical mixtures. Comput. Stat. Data Anal. 54, 816–832 (2010)
Argiento, R., Cremaschi, A., Guglielmi, A.: A “density-based” algorithm for cluster analysis using species sampling Gaussian mixture models. J. Comput. Graph. Stat. 23, 1126–1142 (2014)
Barrios, E., Lijoi, A., Nieto-Barajas, L.E., Prünster, I.: Modeling with normalized random measure mixture models. Stat. Sci. 28, 313–334 (2013)
Bianchini, I.: A Bayesian nonparametric model for density and cluster estimation: the \(\varepsilon \)-NGG mixture model. Tesi di laurea magistrale, Ingegneria Matematica, Politecnico di Milano (2014a)
Bianchini, I.: A new finite approximation for the NGG mixture model: an application to density estimation. In: The Contribution of Young Researchers to Bayesian Statistics: Proceedings of BAYSM2014. Springer, Berlin (2015)
Caron, F.: Bayesian nonparametric models for bipartite graphs. In: NIPS, pp. 2060–2068 (2012)
Caron, F., Fox, E.B.: Bayesian nonparametric models of sparse and exchangeable random graphs. arXiv preprint. arXiv:1401.1137 (2014)
Chen, C., Ding, N., Buntine, W.: Dependent hierarchical normalized random measures for dynamic topic modeling. arXiv preprint. arXiv:1206.4671 (2012)
Cho, R.J., Campbell, M.J., Winzeler, E.A., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg, T.G., Gabrielian, A.E., Landsman, D., Lockhart, D.J., Davis, R.W.: A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell. 2, 65–73 (1998)
Escobar, M., West, M.: Bayesian density estimation and inference using mixtures. J. Am. Stat. Assoc. 90, 577–588 (1995)
Favaro, S., Teh, Y.: MCMC for normalized random measure mixture models. Stat. Sci. 28(3), 335–359 (2013)
Favaro, S., Walker, S.G.: Slice sampling \(\sigma \)-stable Poisson-Kingman mixture models. J. Comput. Graph. Stat. 22(4), 830–847 (2013)
Favaro, S., Guglielmi, A., Walker, S.: A class of measure-valued Markov chains and Bayesian nonparametrics. Bernoulli 18(3), 1002–1030 (2012)
Ferguson, T.S., Klass, M.: A representation of independent increment processes without Gaussian components. Ann. Math. Stat. 43, 1634–1643 (1972)
Gelfand, A.E., Kottas, A.: A computational approach for full nonparametric Bayesian inference under Dirichlet process mixture models. J. Comput. Graph. Stat. 11, 289–305 (2002)
Gradshteyn, I., Ryzhik, L.: Table of Integrals, Series, and Products, 6th edn. Academic Press, San Diego (2000)
Griffin, J., Walker, S.G.: Posterior simulation of normalized random measure mixtures. J. Comput. Graph. Stat. 20, 241–259 (2011)
Griffin, J.E.: An adaptive truncation method for inference in Bayesian nonparametric models. Stat. Comput. doi:10.1007/s11222-014-9519-4 (2014)
Griffin, J.E., Kolossiatis, M., Steel, M.F.: Comparing distributions by using dependent normalized random-measure mixtures. J. R. Stat. Soc. B 75(3), 499–529 (2013)
Ishwaran, H., James, L.: Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96, 161–173 (2001)
Ishwaran, H., Zarepour, M.: Markov chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models. Biometrika 87, 371–390 (2000)
Ishwaran, H., Zarepour, M.: Exact and approximate sum representations for the Dirichlet process. Can. J. Stat. 30, 269–283 (2002)
James, L., Lijoi, A., Prünster, I.: Posterior analysis for normalized random measures with independent increments. Scand. J. Stat. 36, 76–97 (2009)
Kallenberg, O.: Random Measures, 4th edn. Akademie, Berlin (1983)
Kalli, M., Griffin, J.E., Walker, S.G.: Slice sampling mixture models. Stat. Comput. 21(1), 93–105 (2011)
Kingman, J.F.C.: Poisson Processes, vol. 3. Oxford University Press, Oxford (1993)
Lijoi, A., Mena, R.H., Prünster, I.: Controlling the reinforcement in Bayesian nonparametric mixture models. J. R. Stat. Soc. B 69, 715–740 (2007)
Lijoi, A., Prünster, I., Walker, S.G.: Investigating nonparametric priors with Gibbs structure. Stat. Sin. 18, 1653–1668 (2008)
Lijoi, A., Nipoti, B., Prünster, I.: Bayesian inference with dependent normalized completely random measures. Bernoulli 20, 1260–1291 (2014)
MacEachern, S.N.: Computational methods for mixture of Dirichlet process models. In: Practical Nonparametric and Semiparametric Bayesian Statistics. Lecture Notes in Statistics, vol. 133, pp. 23–43. Springer, New York (1998)
Muliere, P., Tardella, L.: Approximating distributions of random functionals of Ferguson-Dirichlet priors. Can. J. Stat. 26(2), 283–297 (1998)
Neal, R.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9, 249–265 (2000)
Papaspiliopoulos, O., Roberts, G.O.: Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models. Biometrika 95, 169–186 (2008)
Pitman, J.: Some developments of the Blackwell-MacQueen urn scheme. In: Ferguson, T.S., Shapley, L.S., MacQueen, J.B. (eds.) Statistics, Probability and Game Theory: Papers in Honor of David Blackwell. IMS Lecture Notes-Monograph Series, vol. 30, pp. 245–267. Institute of Mathematical Statistics, Hayward (1996)
Pitman, J.: Poisson-Kingman partitions. In: Science and Statistics: A Festschrift for Terry Speed. IMS Lecture Notes-Monograph Series, vol. 40, pp. 1–34. Institute of Mathematical Statistics, Hayward (2003)
Pitman, J.: Combinatorial Stochastic Processes. Lecture Notes in Mathematics, vol. 1875. Springer, New York (2006)
Regazzini, E., Lijoi, A., Prünster, I.: Distributional results for means of random measures with independent increments. Ann. Stat. 31, 560–585 (2003)
Sethuraman, J.: A constructive definition of Dirichlet priors. Stat. Sin. 4(2), 639–650 (1994)
Walker, S.G.: Sampling the Dirichlet mixture model with slices. Commun. Stat. Simulat. 36, 45–54 (2007)
Acknowledgments
The authors would like to thank the referees and the Associate Editor for their valuable help in improving this manuscript. Ilaria Bianchini was partially funded by CNR-IMATI Flagship project - Factory of the future - Fab at Hospital.
Appendices
1.1 Appendix 1: Proof of (5)
First observe that, since \(N_\varepsilon \) has a Poisson distribution with parameter \(\varLambda _\varepsilon \), we have
Then, formula (30) in Pitman (1996) yields
where the vector \((j_1,\dots ,j_k)\) ranges over all sequences of \(k\) distinct elements of \(\{0,\dots ,N_\varepsilon \}\). Using the gamma function identity,
we have:
Substituting the expression of \(\rho _{\varepsilon }\) in Sect. 3:
We have omitted the indicator function in all the previous lines. If we switch the finite sum and the integral, since the integrand does not depend on the positions of the clusters \(j_i\), \(i=1,\dots ,k\), but only on the sizes \(n_i\), and there are \((N_\varepsilon +1)N_\varepsilon \cdots (N_\varepsilon +2-k)={(N_\varepsilon +1)!}/{(N_{\varepsilon }+1-k)!}\) sequences of \(k\) distinct elements of \(\{0,\dots ,N_\varepsilon \}\), we get:
Observe that, because of the indicator function in the above formula, the summation in (12) runs over \(N_\varepsilon \) from \(k-1\) to \(+\infty \). Then, the change of variable \(N_{na}=N_\varepsilon +1-k\) in the summation (\(N_{na}\) is the number of non-allocated jumps) and simple calculations give
By Fubini’s theorem, we can switch integration and summation, and introduce \(\varLambda _{\varepsilon ,u}\) as defined in (6), so that
that is (5), since
1.2 Appendix 2: Lemma 1
We state this simple lemma because it is needed in the proof of Lemma 2.
Lemma 1
Let \((a_n)\) and \((b_n)\) be two sequences of real numbers, such that
where \(l,a_0,b_0\) are finite, and \(a_0+b_0=l\). Then
Proof
By definition of \(\liminf \) and \(\limsup \) we have:
From the hypotheses we have
so that \(\limsup b_n=b_0\), but by hypothesis \(b_0=\liminf b_n\), and consequently
We prove similarly that \(\lim _{n\rightarrow +\infty }a_{n}=a_0\).
Of course, this lemma can be generalized to any finite number of sequences.
1.3 Appendix 3: Lemma 2
Lemma 2
Let \(p_\varepsilon \) be the eppf of an \(\varepsilon \)-NGG\((\sigma ,\kappa ,\omega , P_0)\) process. Then for each \(n_1,\dots ,n_k \in \mathbb {N}\) with \(k\ge 1\) and \(\sum _{i=1}^{k}n_i=n\), we have
where \(p_0(\cdot )\) is the eppf of a NGG\((\sigma ,\kappa ,\omega ,P_0)\) process.
Proof
Formula (5) can be restated as
where \(f_{\varepsilon }\) denotes the integrand in the right hand side of (5) itself. In addition, the eppf of a NGG\((\sigma ,\kappa ,\omega , P_0)\) process can be written as
where
see, for instance, Lijoi et al. (2007). We first show that
This is straightforward by the following remarks:
1.
\(\lim _{\varepsilon \rightarrow 0}\varGamma (n_i-\sigma ,(u+\omega )\varepsilon )=\varGamma (n_i-\sigma )\), for any \(i=1,2,\ldots ,k\), by the Dominated Convergence Theorem, since \(n_i-\sigma \ge 1-\sigma >0\);
2.
since \(\lim _{\varepsilon \rightarrow 0}\varGamma (-\sigma ,\omega \varepsilon )=+\infty \) and
$$\begin{aligned} \varGamma (1-\sigma ,x)=-\sigma \varGamma (-\sigma ,x) + x^{-\sigma }{\mathrm{e}}^{-x} \end{aligned}$$(Gradshteyn and Ryzhik 2000), we have:
$$\begin{aligned} \begin{aligned}&\lim _{\varepsilon \rightarrow 0} \frac{\varLambda _{\varepsilon ,u}+k}{\omega ^\sigma \varGamma (-\sigma ,\omega \varepsilon )}= \frac{\kappa }{\varGamma (1-\sigma )}, \\&\lim _{\varepsilon \rightarrow 0} \left( \varLambda _{\varepsilon ,u}-\varLambda _{\varepsilon } \right) = -\kappa \frac{(\omega +u)^\sigma -\omega ^\sigma }{\sigma }. \end{aligned} \end{aligned}$$
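These two limits can be checked numerically. The sketch below is ours: it assumes \(\varLambda _\varepsilon =\kappa \omega ^\sigma \varGamma (-\sigma ,\omega \varepsilon )/\varGamma (1-\sigma )\) and \(\varLambda _{\varepsilon ,u}=\kappa (\omega +u)^\sigma \varGamma (-\sigma ,(\omega +u)\varepsilon )/\varGamma (1-\sigma )\) (consistent with the expressions for \(R\) in Appendix 6), and evaluates \(\varGamma (-\sigma ,\cdot )\) through the recurrence quoted above, since SciPy's incomplete gamma routines do not accept a negative shape.

```python
import numpy as np
from scipy.special import gamma as gamma_fn, gammaincc

def upper_inc_gamma_neg(sigma, x):
    # Gamma(-sigma, x) via Gamma(1-sigma, x) = -sigma*Gamma(-sigma, x) + x**(-sigma)*exp(-x)
    g1 = gamma_fn(1.0 - sigma) * gammaincc(1.0 - sigma, x)  # Gamma(1-sigma, x)
    return (x ** (-sigma) * np.exp(-x) - g1) / sigma

sigma, kappa, omega, u, k = 0.4, 1.5, 1.0, 2.0, 3

def lam(eps, rate):
    # Lambda_eps when rate = omega; Lambda_{eps,u} when rate = omega + u
    return (kappa * rate ** sigma
            * upper_inc_gamma_neg(sigma, rate * eps) / gamma_fn(1.0 - sigma))

eps = 1e-10
diff = lam(eps, omega + u) - lam(eps, omega)
limit = -kappa * ((omega + u) ** sigma - omega ** sigma) / sigma
ratio = (lam(eps, omega + u) + k) / (omega ** sigma * upper_inc_gamma_neg(sigma, omega * eps))
# diff is close to `limit`, and `ratio` is close to kappa / Gamma(1 - sigma),
# in agreement with the two displayed limits.
```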
Now let \(\fancyscript{C}=\{C_1,\dots ,C_k\}\) be a partition of \(\{ 1, \dots , n \}\) with group sizes \((n_1,\dots ,n_k)\), and let \(\Pi _n\) be the set of all possible partitions of \(\{ 1, \dots , n \}\), of any size \(k=1,\ldots ,n\). Of course, by definition of eppf,
and, in particular, this holds for both \(p_\varepsilon \) and \(p_0\). Moreover, by Fatou's Lemma we have
Suppose now that for a particular partition \(\fancyscript{C}\in \Pi _n\) we have \({\displaystyle p_0(n_1,\dots ,n_k)< \liminf _{\varepsilon \rightarrow 0}p_{\varepsilon }(n_1,\dots ,n_k)}\). In this case
which is a contradiction. Therefore we conclude that
for all \(n_1,\ldots ,n_k\), all \(k\). Summing up, we have proved so far that:
- \(\lim _{\varepsilon \rightarrow 0}\sum _{\fancyscript{C} \in \Pi _n}^{}p_{\varepsilon }(n_1,\dots ,n_k) =1\);
- \(\liminf _{\varepsilon \rightarrow 0} p_\varepsilon (n_1,\dots ,n_k)=p_0(n_1,\dots ,n_k)\) for all \(\fancyscript{C}=\{C_1,\dots ,C_k\} \in \Pi _n\);
- \(\sum _{\fancyscript{C} \in \Pi _n}^{} p_0(n_1,\dots ,n_k)=1\).
By Lemma 1, equation (14) follows.
1.4 Appendix 4: Proof of Proposition 1
As mentioned before, \(P_\varepsilon \) is a proper species sampling model, so that \(p_{\varepsilon }\) defines a probability law on the set of all partitions of \(\mathbb {N}_n:=\{1,\dots ,n\}\), once a positive integer \(n\) has been fixed. Therefore, we introduce \((N_1^\varepsilon ,\ldots ,N_k^\varepsilon )\), the sizes of the blocks (in order of appearance) of the random partition \(C_{\varepsilon ,n}\) defined by \(p_\varepsilon \), for any \(\varepsilon \ge 0\). The probability distributions of \(\{(N_1^\varepsilon ,\ldots ,N_k^\varepsilon ),\varepsilon \ge 0 \}\) are proportional to the values of \(p_\varepsilon \) (for any \(\varepsilon \ge 0\)) as in (2.6) of Pitman (2006). Hence, by Lemma 2, for any \(k=1,\ldots ,n\) and any \(n\),
Here \((N_1^0,\ldots ,N_k^0)\) denotes the sizes of the blocks (in order of appearance), of the random partition \(C_{0,n}\) defined by \(p_0\), the eppf of a NGG\((\sigma ,\kappa ,\omega , P_0)\) process. By formula (2.30) in Pitman (2006), we have
where \(P_j^\varepsilon \) and \(\tilde{P}_j\) are the \(j\)-th weights of an \(\varepsilon \)-NGG and of a NGG process (both with parameters \((\sigma ,\kappa ,\omega , P_0)\)), respectively. Note that the sequences depending on \(n\) have only a finite number of positive weights. Recall that weak convergence of a sequence of random probability measures is equivalent to pointwise convergence of the corresponding Laplace transforms (see Kallenberg 1983, Theorem 4.2). Let \(f(\cdot )\) be a continuous and bounded function on \(\varTheta \). If we can invert the order of the limit operations below, then we have:
Here we have introduced notation
thus (15) proves the stated convergence, conditionally on \(\{ \tau _0,\tau _1,\tau _2,\ldots \}\), which are iid from \(P_0\).
To justify the interchange of the two limits above, we must prove that the sequence \(\left\{ {\mathbb {E}}\left( {\mathrm{e}}^{-\int _{}^{}fd \mu _n^{\varepsilon } } \right) , n\ge 1\right\} \) converges uniformly in \(\varepsilon \). To this end, it is sufficient to show that the difference between two consecutive terms of the sequence does not depend on \(\varepsilon \); in fact, for any \(M>0\), since
we have
where \(M \ge \sup f\). Now let \(C_{\varepsilon ,n+1}\) be a random partition of \(\{1,\dots ,n+1\}\) whose restriction to \(\{1,\dots ,n\}\) coincides with \(C_{\varepsilon ,n}\). We distinguish two cases:
1.
\(C_{\varepsilon ,n+1}\) has the same number of clusters as \(C_{\varepsilon ,n}\); one of these clusters (say the \(j\)-th) has one more element and, consequently, size \(n_j+1\);
2.
\(C_{\varepsilon ,n+1}\) has one more cluster than \(C_{\varepsilon ,n}\); this new cluster contains a single element.
In both cases, it is not difficult to prove that
so that we are able to interchange the two limits. Finally, it is straightforward to show that the stated convergence follows from the convergence in distribution conditionally on \(\{ \tau _0,\tau _1,\tau _2,\ldots \}\), with an argument on Laplace transforms as before. This ends the first part of the Proposition, i.e. convergence as \(\varepsilon \rightarrow 0\).
Convergence as \(\varepsilon \rightarrow +\infty \) is straightforward as well: as \(\varepsilon \) increases to \(+\infty \), no jumps remain in (4) except the extra \(J_0\), so that \(P_\varepsilon \) degenerates to \(\delta _{\tau _0}\).
1.5 Appendix 5: Proof of Proposition 2
The conditional distribution of \(\varvec{\theta }\) is:
where \((\theta ^*_1,\theta ^*_2,\dots , \theta ^*_k)\) is the vector of the unique values in the sample. We omit the indicator \( \mathbb {I}_{{\{1, \ldots ,N_\varepsilon +1 \}}}(k) \) until we need it. Introducing the auxiliary variable \(U\), by (13) we have:
Hence we have:
where, in this proof, \(\fancyscript{P}_0(N_\varepsilon ;\varLambda _\varepsilon )\) denotes the Poisson probability mass function with parameter \(\varLambda _\varepsilon \), evaluated at \(N_\varepsilon \), and \(P_0(\tau )\) denotes the density of \(P_0\) evaluated at \(\tau \).
The conditional distribution of \(P_\varepsilon \), given \(U=u\) and \(\varvec{\theta }\), is as follows:
The second factor on the right-hand side is proportional to
Observe that, for any \(j \notin \{ l^*_1, \ldots , l^*_k \} \),
The integrand function in the second line of the formula above is the kernel of the mean intensity of an \(\varepsilon \)-NGG\((\sigma ,\kappa ,\omega +u,P_0)\) process. On the other hand, for \(i=1,\ldots ,k\):
The integrand function in (19) is the kernel of a gamma density with parameters \((n_i-\sigma ,u+\omega )\), restricted to \((\varepsilon ,+\infty )\). Summing up, we have
As in the proof of formula (5), \(N_{na}=N_\varepsilon +1-k\) is the number of non-allocated jumps. Therefore, since \(k\) is given, the conditional distribution \(\fancyscript{L}(N_\varepsilon | u,\varvec{\theta })\) is identified by \(\fancyscript{L}(N_{na}| u,\varvec{\theta })\); we have
Let \(\varLambda _{\varepsilon ,u}\) be as in (6); it easily follows that
On the other hand, the first factor on the right-hand side of (17) can be computed as follows. Denote by \({\varvec{l}}^*=(l^*_1,\ldots ,l^*_k)\) the vector of locations of the allocated jumps. From (16), it is clear that
The first factor in (22) refers to the unnormalized allocated process: its support is \({\varvec{\theta }}^*\), while the jumps follow independent restricted gamma densities, as observed after (19). This proves point 2. of the Proposition.
On the other hand, the second factor in (22) shows that the non-allocated jumps are indeed the jumps of an \(\varepsilon \)-NGG(\(\sigma \), \(\kappa \), \(\omega +u\), \(P_0\)) process, given that exactly \(N_{na}\) jumps of the process were obtained; moreover, the conditional distribution of \(N_{na}\) is given in (21). This proves point 1. of the Proposition.
Point 3. follows straightforwardly from (22). Normalization of the jumps (allocated and non-allocated) gives point 4.
As for point 5., we need to integrate out \(N_\varepsilon \) from \(\fancyscript{L}(N_\varepsilon ,u,{\varvec{\theta }})\) displayed in (20). These computations have already been carried out in the proof of formula (5), so that \(f_{U|\varvec{\theta }^{*}}(u|\varvec{\theta }^{*})\) is proportional to the integrand in (5).
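Point 2. requires sampling the allocated jumps from gamma densities with parameters \((n_i-\sigma ,u+\omega )\) restricted to \((\varepsilon ,+\infty )\). This is easy by inversion of the upper tail; the short sketch below is ours (the helper name is hypothetical), using SciPy's regularized upper incomplete gamma function and its inverse.

```python
import numpy as np
from scipy.special import gammaincc, gammainccinv

def sample_trunc_gamma(shape, rate, eps, size, rng):
    """Sample from a gamma(shape, rate) density restricted to (eps, +inf),
    by inverting the (regularized) upper tail of the gamma cdf."""
    tail = gammaincc(shape, rate * eps)   # P(G > eps) for G ~ gamma(shape, rate)
    q = rng.uniform(size=size) * tail     # uniform over the upper-tail probabilities
    return gammainccinv(shape, q) / rate  # quantiles in the tail; every draw exceeds eps
```

For instance, an allocated jump with \(n_i=3\), \(\sigma =0.4\), \(u+\omega =3\) and \(\varepsilon =0.5\) is drawn with `sample_trunc_gamma(2.6, 3.0, 0.5, 1, rng)`; every draw lies above the threshold by construction, since \(q\mapsto \) `gammainccinv(shape, q)` is decreasing.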
1.6 Appendix 6: Details of the blocked Gibbs sampler
We explicitly derive every step of the Gibbs sampler in Fig. 1, starting from the joint distribution of data and parameters in (11).
1.
The first step is straightforward, since
$$\begin{aligned} \fancyscript{L}(u|{\varvec{X}},{\varvec{\theta }},P_\varepsilon ,\varepsilon ,\sigma ,\kappa ) \propto \fancyscript{L}(u, {\varvec{X}},{\varvec{\theta }},P_\varepsilon ,\varepsilon ,\sigma ,\kappa ). \end{aligned}$$
2.
Thanks to the hierarchical structure of the model, the following relation holds true:
$$\begin{aligned} \begin{aligned}&\fancyscript{L}(\varvec{\theta }|\varvec{X}, P_\varepsilon , \varepsilon , \sigma , \kappa , u)\propto \prod _{i=1}^{n}k(X_i; \theta _i)\sum _{j=0}^{N_\varepsilon }J_j\delta _{\tau _j}(\theta _i)\\&=\prod _{i=1}^{n}\sum _{j=0}^{N_\varepsilon }J_jk(X_i; \tau _j)\delta _{\tau _j}(\theta _i), \end{aligned} \end{aligned}$$so that each \(\theta _i\) is independently drawn from the discrete distribution selecting \(\tau _j\) with probability proportional to \(J_jk(X_i; \tau _j)\); this proves Step 2.
3.
As far as \(\fancyscript{L}(P_\varepsilon , \varepsilon , \sigma , \kappa |u, \varvec{\theta }, \varvec{X})\) is concerned, we have
$$\begin{aligned} \begin{aligned}&\fancyscript{L}(P_\varepsilon , \varepsilon , \sigma , \kappa |u, \varvec{\theta }, \varvec{X})= \fancyscript{L}(P_\varepsilon , \varepsilon , \sigma , \kappa |u, \varvec{\theta })\\&= \fancyscript{L}(P_\varepsilon |\varepsilon , \sigma , \kappa , u, \varvec{\theta }) \fancyscript{L}(\varepsilon , \sigma , \kappa | u, \varvec{\theta }), \end{aligned} \end{aligned}$$so that Step 3. can be split into two consecutive substeps. First we simulate from \( \fancyscript{L}(\varepsilon , \sigma , \kappa | u, \varvec{\theta })\) as follows: we integrate out \(N_\varepsilon \) (or equivalently \(N_{na}\)) from (20) and obtain
$$\begin{aligned} \begin{aligned}&\fancyscript{L}(\varepsilon , \sigma ,\kappa | u, \varvec{\theta }, \varvec{X}) \propto \sum _{N_{na}=0}^{+\infty } \fancyscript{L}(N_{na},\varepsilon , \sigma ,\kappa | u, \varvec{\theta }, \varvec{X}) \\&\quad = \dfrac{u^{n-1}}{\varGamma (n)}\left( \dfrac{\kappa }{\varGamma (1-\sigma )}\right) ^{k-1} \prod _{i=1}^{k}\biggl [\varGamma (n_i-\sigma , \varepsilon (u+\omega )) \biggr ] \\&\qquad \times \dfrac{(\omega +u)^{\sigma k-n}}{\omega ^{\sigma }\varGamma (-\sigma , \omega \varepsilon )}e^{\varLambda _{\varepsilon ,u}- \varLambda _{\varepsilon }} \left( \varLambda _{\varepsilon ,u}+k\right) \pi (\varepsilon )\pi (\sigma )\pi (\kappa ). \end{aligned} \end{aligned}$$In practical terms, Step 3a can be obtained in three substeps:
$$\begin{aligned}&\fancyscript{L}(\varepsilon | u, \varvec{\theta }, \varvec{X})\propto \prod _{i=1}^{k}\varGamma (n_i-\sigma , \varepsilon (u+\omega )) {\mathrm{e}}^{(\varLambda _{\varepsilon ,u}-\varLambda _{\varepsilon })}\nonumber \\&\quad \times \dfrac{\varLambda _{\varepsilon ,u}+k}{\varGamma (-\sigma , \omega \varepsilon )} \pi (\varepsilon ), \end{aligned}$$(23)$$\begin{aligned}&\fancyscript{L}(\sigma |u, \varvec{\theta }, \varvec{X}) \propto \dfrac{(u+\omega )^{k\sigma }}{\omega ^\sigma }\dfrac{\varLambda _{\varepsilon ,u}+k}{\varGamma (-\sigma , \omega \varepsilon )} {\mathrm{e}}^{\left( \varLambda _{\varepsilon ,u}-\varLambda _\varepsilon \right) } \nonumber \\&\quad \times \prod _{i=1}^{k} \varGamma (n_i-\sigma , \varepsilon (u+\omega ))\varGamma (1-\sigma )^{1-k}\pi (\sigma ), \end{aligned}$$(24)$$\begin{aligned}&\fancyscript{L}(\kappa | u, \varvec{\theta }, \varvec{X})= p_1 gamma(\alpha +k, R+\beta ) \nonumber \\&\quad +\,\, (1-p_1) gamma(\alpha +k-1, R+\beta ), \end{aligned}$$(25)where
$$\begin{aligned} R=\dfrac{\omega ^\sigma \varGamma (-\sigma , \varepsilon \omega )}{\varGamma (1-\sigma )}-\dfrac{(\omega +u)^\sigma \varGamma (-\sigma , \varepsilon (\omega +u))}{\varGamma (1-\sigma )} \end{aligned}$$and \(p_1\) is equal to
$$\begin{aligned} \frac{(\alpha +k-1)(u+\omega )^\sigma \varGamma (-\sigma , \varepsilon (\omega +u))}{ (\alpha \!+\!k\!-\!1)(u\!+\!\omega )^\sigma \varGamma (\!-\!\sigma , \varepsilon (\omega +u))\!+\!k(R+\beta )\varGamma (1-\sigma )}. \end{aligned}$$Here we assume that \(\pi (\kappa )\) is \(gamma(\alpha , \beta )\). Step 3.b consists in sampling from \(\fancyscript{L}(P_\varepsilon |\varepsilon , \sigma , \kappa , u, \varvec{\theta })\) and has already been described in Sect. 4.
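As an illustration, the update of \(\kappa \) in (25) can be sketched as follows. This is our sketch, not the authors' code: it reads the two gamma components of (25) in the shape-rate parametrization and evaluates \(\varGamma (-\sigma ,\cdot )\) via the recurrence quoted in Appendix 3, since SciPy's incomplete gamma routines require a positive shape; all names are hypothetical.

```python
import numpy as np
from scipy.special import gamma as gamma_fn, gammaincc

def upper_inc_gamma_neg(sigma, x):
    # Gamma(-sigma, x) via Gamma(1-sigma, x) = -sigma*Gamma(-sigma, x) + x**(-sigma)*exp(-x)
    g1 = gamma_fn(1.0 - sigma) * gammaincc(1.0 - sigma, x)  # Gamma(1-sigma, x)
    return (x ** (-sigma) * np.exp(-x) - g1) / sigma

def sample_kappa(alpha, beta, sigma, omega, u, eps, k, rng):
    """One draw from the full conditional (25) of kappa, a two-component
    mixture of gamma densities, under the gamma(alpha, beta) prior."""
    g_w = omega ** sigma * upper_inc_gamma_neg(sigma, eps * omega)
    g_wu = (omega + u) ** sigma * upper_inc_gamma_neg(sigma, eps * (omega + u))
    R = (g_w - g_wu) / gamma_fn(1.0 - sigma)              # R as displayed above
    p1 = ((alpha + k - 1) * g_wu
          / ((alpha + k - 1) * g_wu + k * (R + beta) * gamma_fn(1.0 - sigma)))
    shape = alpha + k if rng.uniform() < p1 else alpha + k - 1
    return rng.gamma(shape, 1.0 / (R + beta))             # numpy's gamma takes a scale
```

Note that \(R>0\), since \(\omega ^\sigma \varGamma (-\sigma ,\varepsilon \omega )>(\omega +u)^\sigma \varGamma (-\sigma ,\varepsilon (\omega +u))\) for \(u>0\), so the rate \(R+\beta \) of both mixture components is positive.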
Argiento, R., Bianchini, I. & Guglielmi, A. A blocked Gibbs sampler for NGG-mixture models via a priori truncation. Stat Comput 26, 641–661 (2016). https://doi.org/10.1007/s11222-015-9549-6