
Inference for cluster point processes with over- or under-dispersed cluster sizes


Abstract

Cluster point processes comprise a class of models that have been used for a wide range of applications. While several models have been studied for the probability density function of the offspring displacements and the parent point process, there are few examples of non-Poisson distributed cluster sizes. In this paper, we introduce a generalization of the Thomas process, which allows for the cluster sizes to have a variance that is greater or less than the expected value. We refer to this as the cluster sizes being over- and under-dispersed, respectively. To fit the model, we introduce minimum contrast methods and a Bayesian MCMC algorithm. These are evaluated in a simulation study. It is found that using the Bayesian MCMC method, we are in most cases able to detect over- and under-dispersion in the cluster sizes. We use the MCMC method to fit the model to nerve fiber data, and contrast the results to those of a fitted Thomas process.



Acknowledgements

The authors thank Aila Särkkä for her many suggestions and comments, which greatly improved the manuscript. The authors also thank William R. Kennedy and Gwen Wendelschafer-Crabb (University of Minnesota) for blister immunostaining, quantification and morphometry of the ENF data, and for helpful discussions. The research has been supported financially by the Swedish Research Council (VR 2013-5212) and by the Grant Agency of the Czech Republic (Project No. 19-04412S).


Corresponding author

Correspondence to Tomáš Mrkvička.


Appendices

Parameter combinations giving identical intensity and K-function

Recall that for the TGCS

$$\begin{aligned} \rho&= \kappa \frac{\theta }{1-\lambda } \quad \quad \text { and } \\ K(r)&= \pi r^2 + \frac{1 + \frac{\lambda }{\theta }\left( 1+\frac{1}{1-\lambda }\right) }{\kappa }\left[ 1-\exp \left( -\frac{r^2}{4\omega ^2}\right) \right] . \end{aligned}$$

Clearly, for two TGCSs to have the same K-function, their kernel standard deviations must be equal. Now assume that \(Y_0 \sim \text{ TGCS }( \kappa _0, \omega , \lambda _0, \theta _0)\) and \(Y_1 \sim \text{ TGCS }( \kappa _1, \omega , \lambda _1, \theta _1)\), with

$$\begin{aligned} \theta _1&= \left( \theta _0 + \lambda _0\left( 1+\frac{1}{1-\lambda _0}\right) \right) \frac{1-\lambda _1}{1-\lambda _0}-\lambda _1\left( 1+\frac{1}{1-\lambda _1}\right) \end{aligned}$$
(11)
$$\begin{aligned} \kappa _1&= \frac{\theta _0}{\theta _1}\frac{1-\lambda _1}{1-\lambda _0}\kappa _0. \end{aligned}$$
(12)

To show that \(Y_0\) and \(Y_1\) have the same intensity and K-function, we need to show that \(\rho _0 = \rho _1\), where \(\rho _i = \kappa _i \frac{\theta _i}{1-\lambda _i}\), \(i = 0,1\), and \(C_0 = C_1\), where

$$\begin{aligned} C_i = \frac{1 + \frac{\lambda _i}{\theta _i}\left( 1+\frac{1}{1-\lambda _i}\right) }{\kappa _i}, \quad i = 0,1. \end{aligned}$$

Starting with the intensity we have that

$$\begin{aligned} \rho _1&= \frac{\theta _1}{1-\lambda _1}\kappa _1 = \frac{\theta _1}{1-\lambda _1} \frac{\theta _0}{\theta _1} \frac{1-\lambda _1}{1-\lambda _0}\kappa _0 = \frac{\theta _0}{1-\lambda _0}\kappa _0 = \rho _0 \end{aligned}$$

To show that \(C_0 = C_1\) we let \(\rho = \rho _0 = \rho _1\) and start from (11):

$$\begin{aligned}&\theta _1 = \left( \theta _0 + \lambda _0\left( 1+\frac{1}{1-\lambda _0}\right) \right) \frac{1-\lambda _1}{1-\lambda _0}-\lambda _1\left( 1+\frac{1}{1-\lambda _1}\right) \Leftrightarrow \\&\frac{\theta _1 + \lambda _1\left( 1+ \frac{1}{1-\lambda _1}\right) }{1-\lambda _1}= \frac{\theta _0 + \lambda _0\left( 1+ \frac{1}{1-\lambda _0}\right) }{1-\lambda _0} \Leftrightarrow \\&\frac{1 + \frac{\lambda _1}{\theta _1}\left( 1+ \frac{1}{1-\lambda _1}\right) }{\frac{(1-\lambda _1)\rho }{\theta _1}} = \frac{1 + \frac{\lambda _0}{\theta _0}\left( 1+ \frac{1}{1-\lambda _0}\right) }{\frac{(1-\lambda _0)\rho }{\theta _0}} \Leftrightarrow \\&\frac{1 + \frac{\lambda _1}{\theta _1}\left( 1+ \frac{1}{1-\lambda _1}\right) }{\kappa _1} = \frac{1 + \frac{\lambda _0}{\theta _0}\left( 1+ \frac{1}{1-\lambda _0}\right) }{\kappa _0} \Leftrightarrow \\&C_1 = C_0, \end{aligned}$$

where each line is equivalent to the previous one, which completes the proof.
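The identity is easy to verify numerically. Below is a minimal sketch in R (all parameter values are illustrative, and the function names `rho_tgcs` and `C_tgcs` are ours) that maps a reference parameter set through (11) and (12) and checks that the intensity and the K-function constant agree:

```r
rho_tgcs <- function(kappa, lambda, theta) {
  kappa * theta / (1 - lambda)
}
C_tgcs <- function(kappa, lambda, theta) {
  (1 + (lambda / theta) * (1 + 1 / (1 - lambda))) / kappa
}

kappa0 <- 25; lambda0 <- 0.2; theta0 <- 4  # reference TGCS parameters
lambda1 <- -0.3                            # any admissible alternative lambda

# map (kappa0, lambda0, theta0) to (kappa1, lambda1, theta1) via (11)-(12)
theta1 <- (theta0 + lambda0 * (1 + 1 / (1 - lambda0))) *
  (1 - lambda1) / (1 - lambda0) - lambda1 * (1 + 1 / (1 - lambda1))
kappa1 <- (theta0 / theta1) * (1 - lambda1) / (1 - lambda0) * kappa0

# both comparisons return TRUE: the two models share rho and K
all.equal(rho_tgcs(kappa0, lambda0, theta0), rho_tgcs(kappa1, lambda1, theta1))
all.equal(C_tgcs(kappa0, lambda0, theta0),   C_tgcs(kappa1, lambda1, theta1))
```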

MCMC algorithm

We let \({\mathbf {y}}\) denote the observed pattern and \({\mathbf {x}}\) the current parent pattern, with point counts \(N({\mathbf {y}})\) and \(N({\mathbf {x}})\), respectively. Recall that \(\kappa \) is the intensity of the parent process, \(\omega \) the standard deviation of the Gaussian kernel for the offspring displacements, and \(\lambda \) and \(\theta \) the parameters of the GPD for the cluster size distribution. The priors for \(\kappa \), \(\omega \) and \(\theta \) are gamma distributions, and we let \(\alpha _\cdot \) and \(\beta _\cdot \) denote the shape and rate parameters of the priors, respectively, where the subindex indicates which parameter the prior is for. Furthermore, let \({\mathbf {p}}(y) = \{p_y\}_{y\in {\mathbf {y}}}\) be the connections, i.e. \(p_y = x\) means that offspring y belongs to parent \(x \in {\mathbf {x}}\).

1.1 Full conditional density

The full conditional density of the unobserved entities is then given by

$$\begin{aligned} \begin{aligned} p( {\mathbf {x}}, \kappa , \omega , \lambda , \theta , {\mathbf {p}}(y)| {\mathbf {y}} ) \propto&p( {\mathbf {y}}| {\mathbf {x}}, {\mathbf {p}}(y), \omega ) p( {\mathbf {p}}(y)| \omega , \lambda , \theta , {\mathbf {x}})\\ {}&p( {\mathbf {x}}| \kappa )p( \kappa )p( \omega )p(\lambda ) p( \theta ), \end{aligned} \end{aligned}$$
(13)

up to a normalizing constant.

The first factor is the product of bivariate Gaussian densities evaluated at the displacements between offspring and their respective parents, i.e.

$$\begin{aligned} p( {\mathbf {y}}| {\mathbf {x}}, {\mathbf {p}}(y), \omega ) {=} (2\pi \omega ^2)^{-N({\mathbf {y}})} \exp \left( -\frac{1}{2\omega ^2}\sum _{y\in {\mathbf {y}}}{||y - p_y||^2} \right) \end{aligned}$$
(14)

Moreover, since \(X|\kappa \) is a Poisson process with intensity \(\kappa \), the density with respect to the unit rate Poisson process is given by

$$\begin{aligned} p( {\mathbf {x}}|\kappa ) = \kappa ^{N({\mathbf {x}})}\exp \left( |W|(1-\kappa ) \right) , \end{aligned}$$
(15)

and the last four densities are the priors for the respective parameters. The conditional density \(p({\mathbf {p}}(y)|\omega , \lambda , \theta , {\mathbf {x}})\) depends on the number of offspring of each parent generated by the connections \({\mathbf {p}}(y)\), and takes a little more work to derive.

First, let \({\mathbf {o}}(y) = \{o_x(y)\}_{ x\in {\mathbf {x}}}\) be the observed cluster sizes. Note that we can obtain \({\mathbf {o}}(y)\) from \({\mathbf {p}}(y)\) by computing, for each x,

$$\begin{aligned} o_x(y) = \sum _{y \in {\mathbf {y}}} {\mathbf {1}}\{ p_y = x \}, \end{aligned}$$

where \({\mathbf {1}}\) is the indicator function. Having \(o_x(y) = k\) means that parent x has k offspring in the observation window, W, but it could have further offspring that are not observed. We denote the total numbers of offspring per parent by \({\mathbf {o}}(y)^* = \{o_x(y)^*\}_{ x\in {\mathbf {x}}}\); these follow a generalized Poisson distribution, i.e. \(o_x(y)^* \sim \text{ GPD }( \lambda , \theta )\). The probability that an offspring of a parent x falls inside the observation window is the integral of the kernel centered at x over the window, denoted by \(\nu _W( x; \omega )\), i.e.

$$\begin{aligned} \nu _W(x;\omega ) = \int _W \frac{1}{2\pi \omega ^2} \exp \left( -\frac{||x-z||^2}{2\omega ^2}\right) dz. \end{aligned}$$

Thus, by the independence of the offspring displacements, we get that the probability that a parent x has o offspring inside W is

$$\begin{aligned}&P(o_x(y) = o| x, \lambda , \theta ,\omega ) \\&\quad =\sum _{k = o}^\infty \binom{k}{o}\nu _W(x;\omega )^{o} (1-\nu _W(x;\omega ))^{k-o} p_{\text{ GPD }}( k| \lambda , \theta ), \end{aligned}$$

where \(p_{\text{ GPD }}(\cdot | \lambda , \theta )\) is the PMF of the GPD. In practice, this sum is evaluated numerically by truncating it at a value of k for which \(p_{\text{ GPD }}(k| \lambda , \theta ) < 10^{-6}\). The probability of a given connection configuration, \({\mathbf {p}}(y)\), given a parent pattern \({\mathbf {x}}\), but disregarding the displacements of the offspring, is the probability of the cluster sizes, \({\mathbf {o}}(y)\), generated by \({\mathbf {p}}(y)\), divided by the number of possible configurations with the same \({\mathbf {o}}(y)\), i.e.

$$\begin{aligned} P( {\mathbf {p}}(y) | \omega , \lambda , \theta , {\mathbf {x}} ) = \frac{\prod _{x \in {\mathbf {x}}} o_x(y)!}{\left( \sum _{x\in {\mathbf {x}}} o_x(y)\right) !} \prod _{x\in {\mathbf {x}}} P( o_x(y) | x, \omega , \lambda , \theta ). \end{aligned}$$
(16)
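A minimal R sketch of these two ingredients, assuming a rectangular window \(W = [0,a] \times [0,b]\) (so that \(\nu_W\) factorizes into one-dimensional Gaussian CDF differences) and the standard Consul–Jain form of the GPD PMF; the truncation follows the \(10^{-6}\) cutoff above, with the safeguard of summing at least past the GPD mean \(\theta/(1-\lambda)\). All function names here are ours:

```r
nu_W <- function(x, omega, a, b) {  # kernel mass inside W = [0,a] x [0,b]
  (pnorm(a, x[1], omega) - pnorm(0, x[1], omega)) *
    (pnorm(b, x[2], omega) - pnorm(0, x[2], omega))
}

dgpd <- function(k, lambda, theta) {  # generalized Poisson PMF
  if (theta + lambda * k <= 0) return(0)  # support cut-off for negative lambda
  theta * (theta + lambda * k)^(k - 1) * exp(-(theta + lambda * k)) / factorial(k)
}

p_obs <- function(o, x, lambda, theta, omega, a, b, tol = 1e-6) {
  nu <- nu_W(x, omega, a, b)
  prob <- 0; k <- o
  repeat {
    pk <- dgpd(k, lambda, theta)
    prob <- prob + choose(k, o) * nu^o * (1 - nu)^(k - o) * pk
    k <- k + 1
    if (pk < tol && k > theta / (1 - lambda)) break  # truncate past the mean
  }
  prob
}

p_obs(o = 3, x = c(0.5, 0.5), lambda = 0.2, theta = 4, omega = 0.05, a = 1, b = 1)
```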

1.2 Updating scheme

For many proposals we use the log-normal distribution. We say that \(Z \sim \log {\mathcal {N}} (\mu , \sigma )\) if the density is given by

$$\begin{aligned} f_Z(z) = \frac{1}{\sqrt{2\pi }\sigma z}\exp \left( -\frac{(\log z - \mu )^2}{2\sigma ^2} \right) . \end{aligned}$$

All updates are performed using a Metropolis-Hastings step.
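As an illustration, a generic Metropolis–Hastings helper of the kind used in the sketches below (the function names are ours, not from the paper). Note that for this log-normal proposal the density ratio \(q(z^{(i)}|z^*)/q(z^*|z^{(i)})\) reduces to \(z^*/z^{(i)}\), a factor already folded into the Hastings ratios of the following subsections:

```r
## generic accept/reject step, computed on the log scale
mh_step <- function(current, propose, log_ratio) {
  proposal <- propose(current)
  if (log(runif(1)) < log_ratio(proposal, current)) proposal else current
}

## log-normal random-walk proposal: log Z* ~ N(log z, sigma^2)
propose_lognormal <- function(sigma) {
  function(z) rlnorm(1, meanlog = log(z), sdlog = sigma)
}
```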

1.3 Updating \(\kappa \)

The full conditional density of \(\kappa \) is given by

$$\begin{aligned} p( \kappa |\ldots ) \propto p( {\mathbf {x}}| \kappa ) p( \kappa ), \end{aligned}$$

and given the current value, \(\kappa ^{(i)}\), the proposal, \(\kappa ^*\), is log-normal with mean \(\log \kappa ^{(i)}\) and standard deviation \(\sigma _\kappa \). This, together with the prior, gives the Hastings ratio as

$$\begin{aligned} r_\kappa (\kappa ^*|\kappa ^{(i)})&= \left( \frac{\kappa ^*}{\kappa ^{(i)}} \right) ^{N({\mathbf {x}})+1}\exp \left( -|W|(\kappa ^*-\kappa ^{(i)})\right) \times \\&\left( \frac{\kappa ^*}{\kappa ^{(i)}}\right) ^{\alpha _\kappa -1}\exp \left( -\beta _\kappa (\kappa ^*-\kappa ^{(i)})\right) . \end{aligned}$$
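On the log scale this ratio collapses to two terms, since the likelihood, prior and proposal contributions combine. A sketch in R, with assumed names `n_x` \(= N({\mathbf {x}})\), `area_W` \(= |W|\), and gamma prior parameters `alpha_kappa`, `beta_kappa`:

```r
log_r_kappa <- function(kappa_new, kappa_old, n_x, area_W,
                        alpha_kappa, beta_kappa) {
  # (kappa*/kappa)^(N(x)+alpha) * exp(-(|W|+beta)(kappa*-kappa)), logged
  (n_x + alpha_kappa) * (log(kappa_new) - log(kappa_old)) -
    (area_W + beta_kappa) * (kappa_new - kappa_old)
}

## one kappa update, using the helpers defined earlier:
## kappa <- mh_step(kappa, propose_lognormal(sigma_kappa),
##                  function(new, old) log_r_kappa(new, old, n_x, area_W, 2, 0.1))
```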

1.4 Updating \(\omega \)

Given the current value, \(\omega ^{(i)}\), the proposal, \(\omega ^*\), is log-normal, with mean \(\log \omega ^{(i)} \) and standard deviation \(\sigma _\omega \). The full conditional density of \(\omega \) is given by

$$\begin{aligned} p( \omega | \ldots ) \propto p( {\mathbf {y}}| {\mathbf {x}}, {\mathbf {p}}(y), \omega ) p( {\mathbf {p}}(y)|{\mathbf {x}}, \omega , \lambda , \theta )p(\omega ). \end{aligned}$$

Given the prior and the proposal density, we get the Hastings ratio for \(\omega \) as

$$\begin{aligned} r_\omega ( \omega ^*|\omega ^{(i)})&= \exp \left[ -\left( \frac{1}{2(\omega ^*)^2}-\frac{1}{2(\omega ^{(i)})^2}\right) \sum _{y\in {\mathbf {y}}}||y-p_y||^2\right] \times \\&\left( \frac{\omega ^{(i)}}{\omega ^*}\right) ^{2N({\mathbf {y}}) - \alpha _\omega }\prod _{x\in {\mathbf {x}}} \frac{P(o_x(y)|x, \lambda , \theta , \omega ^*)}{P(o_x(y)| x, \lambda , \theta , \omega ^{(i)})} \times \\&\exp [ -\beta _\omega (\omega ^* - \omega ^{(i)})]. \end{aligned}$$
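A log-scale sketch of this ratio in R; here `sq_disp` denotes \(\sum_{y} ||y - p_y||^2\), `n_y` \(= N({\mathbf {y}})\), and `log_p_obs(omega)` is an assumed helper returning \(\sum_x \log P(o_x(y)|x,\lambda,\theta,\omega)\), e.g. built from `p_obs` above:

```r
log_r_omega <- function(om_new, om_old, sq_disp, n_y,
                        alpha_omega, beta_omega, log_p_obs) {
  -(1 / (2 * om_new^2) - 1 / (2 * om_old^2)) * sq_disp +      # Gaussian term
    (2 * n_y - alpha_omega) * (log(om_old) - log(om_new)) +   # power term
    log_p_obs(om_new) - log_p_obs(om_old) -                   # observation probs
    beta_omega * (om_new - om_old)                            # prior rate term
}
```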

1.5 Updating \(\lambda \) and \(\theta \)

In this step we accept or reject \(\lambda \) and \(\theta \) jointly. The full conditional density of the pair \((\lambda , \theta )\) is given as

$$\begin{aligned} p( \lambda , \theta | \ldots )\propto p( {\mathbf {o}}(y)| \lambda , \theta , {\mathbf {x}}, \omega )p(\lambda )p(\theta ). \end{aligned}$$

Given the current values \(\theta ^{(i)}\) and \(\lambda ^{(i)}\), the proposal \(\theta ^*\) is log-normal with mean \(\log \theta ^{(i)}\) and standard deviation \(\sigma _\theta \). The proposal \(\lambda ^*\), given \(\lambda ^{(i)}\) and \(\theta ^*\), is uniform on \([ l^*, u^*]\), where the limits are given by

$$\begin{aligned} l^*&= \max \left\{ -1, \lambda ^{(i)} - \varDelta \lambda , - \frac{\theta ^*}{\max _{x\in {\mathbf {x}}} o_x(y)}\right\} \\ u^*&= \min \left\{ 0.99, \lambda ^{(i)} + \varDelta \lambda \right\} , \end{aligned}$$

where \(\varDelta \lambda \) is a chosen constant. We denote the proposal density by \(q_\lambda ( \lambda ^*| \lambda ^{(i)}, \theta ^*)\). These limits ensure that we do not propose a value outside the support of the prior (recall that the prior for \(\lambda \) is uniform on \([-1, 0.99]\)) and that the current cluster sizes remain possible under the proposed parameter combination \((\lambda ^*, \theta ^*)\). In the same way, we let

$$\begin{aligned} l&= \max \left\{ -1, \lambda ^{*} - \varDelta \lambda , - \frac{\theta ^{(i)}}{\max _{x\in {\mathbf {x}}} o_x(y)}\right\} \\ u&= \min \left\{ 0.99, \lambda ^{*} + \varDelta \lambda \right\} , \end{aligned}$$

i.e. the limits of the proposal interval for \(q_{\lambda }(\lambda ^{(i)}|\lambda ^*, \theta ^{(i)})\). If the proposed \(\theta ^*\) yields \(u^*\le l^*\), the proposal is immediately rejected. From this we get that the Hastings ratio for the pair \((\lambda , \theta )\) is given by

$$\begin{aligned} r_{(\lambda , \theta )}((\lambda ^*, \theta ^*)|(\lambda ^{(i)}, \theta ^{(i)}))&= \frac{\prod _{x\in {\mathbf {x}}} P(o_x(y)|x, \lambda ^*, \theta ^*, \omega )}{\prod _{x\in {\mathbf {x}}} P(o_x(y)|x, \lambda ^{(i)}, \theta ^{(i)}, \omega )} \\&\quad \times \exp \left( -\beta _\theta (\theta ^* -\theta ^{(i)})\right) \\&\quad \times \left( \frac{\theta ^*}{\theta ^{(i)}}\right) ^{\alpha _\theta } \frac{u^*-l^*}{u-l}. \end{aligned}$$
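The proposal mechanics, including the immediate rejection when \(u^* \le l^*\), can be sketched in R as follows (`max_o` is the largest current cluster size and `delta_lambda` is \(\varDelta\lambda\); names are ours):

```r
propose_lambda_theta <- function(lambda_old, theta_old,
                                 sigma_theta, delta_lambda, max_o) {
  theta_new <- rlnorm(1, log(theta_old), sigma_theta)
  l_star <- max(-1, lambda_old - delta_lambda, -theta_new / max_o)
  u_star <- min(0.99, lambda_old + delta_lambda)
  if (u_star <= l_star) return(NULL)        # immediate rejection
  lambda_new <- runif(1, l_star, u_star)
  # reverse-interval limits, needed for the factor (u* - l*)/(u - l)
  l <- max(-1, lambda_new - delta_lambda, -theta_old / max_o)
  u <- min(0.99, lambda_new + delta_lambda)
  list(lambda = lambda_new, theta = theta_new,
       log_q_ratio = log(u_star - l_star) - log(u - l))
}
```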

1.6 Updating \({\mathbf {p}}(y)\)

The full conditional density of \({\mathbf {p}}(y)\) is given by

$$\begin{aligned} p( {\mathbf {p}}(y)| \ldots ) \propto&p( {\mathbf {y}}| {\mathbf {x}}, \omega , {\mathbf {p}}(y) )p( {\mathbf {p}}(y)| {\mathbf {x}}, \omega , \lambda , \theta ), \end{aligned}$$

which, by (14) and (16), equals

$$\begin{aligned}&=(2\pi \omega ^2)^{-N({\mathbf {y}})} \exp \left( -\frac{1}{2\omega ^2}\sum _{y \in {\mathbf {y}}}||y - {p_y}||^2 \right) \\&\quad \frac{o_1! \cdot \ldots \cdot o_{N({\mathbf {x}})}!}{N({\mathbf {y}})!} \prod _{x \in {\mathbf {x}}} p( o_x(y)|x, \omega , \lambda , \theta ). \end{aligned}$$

To update \({\mathbf {p}}(y)\) we choose a point, \(y'\), of \({\mathbf {y}}\) uniformly and propose a new parent for it, uniformly over the current parent pattern. We denote by \(p_{y'}^*\) the proposed parent, and by \(o_{p_{y'}}\) and \(o_{p_{y'}^*}\) the current offspring counts of the current and proposed parents, respectively. Moreover, we let \(o_{p_{y'}}^*\) and \(o_{p_{y'}^*}^*\) be the corresponding offspring counts under the proposed parent for \(y'\), i.e. \(o_{p_{y'}}^* = o_{p_{y'}}-1\) and \(o_{p_{y'}^*}^* =o_{p_{y'}^*}+1\). With this in place, we get the Hastings ratio for updating \({\mathbf {p}}(y)\) as

$$\begin{aligned} r_{{\mathbf {p}}(y)}&( {\mathbf {p}}(y)^*| {\mathbf {p}}(y)) = \exp \left( -\frac{||y'-p_{y'}^*||^2-||y'-p_{y'}||^2}{2\omega ^2}\right) \times \\&\frac{p\left( o_{p_{y'}^*}^*|{\mathbf {x}}, \omega , \lambda , \theta \right) p\left( o_{p_{y'}}^*|{\mathbf {x}}, \omega , \lambda , \theta \right) }{p\left( o_{p_{y'}^*}|{\mathbf {x}}, \omega , \lambda , \theta \right) p\left( o_{p_{y'}}|{\mathbf {x}}, \omega , \lambda , \theta \right) } \times \frac{o_{p_{y'}^*}^*!o_{p_{y'}}^*!}{o_{p_{y'}^*}!o_{p_{y'}}!}. \end{aligned}$$
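One connection update might look as follows in R (a sketch; `y_mat` and `x_mat` hold offspring and parent coordinates row-wise, `p` is the vector of parent indices, and `log_p_o(i, o)` is an assumed helper returning \(\log P(o_{x_i}(y) = o \,|\, x_i, \lambda, \theta, \omega)\)):

```r
update_connection <- function(y_mat, x_mat, p, omega, log_p_o) {
  j       <- sample(nrow(y_mat), 1)        # offspring point y' to reassign
  par_new <- sample(nrow(x_mat), 1)        # proposed parent, uniform on x
  par_old <- p[j]
  if (par_new == par_old) return(p)
  o <- tabulate(p, nbins = nrow(x_mat))    # current cluster sizes o_x(y)
  d_new <- sum((y_mat[j, ] - x_mat[par_new, ])^2)
  d_old <- sum((y_mat[j, ] - x_mat[par_old, ])^2)
  log_r <- -(d_new - d_old) / (2 * omega^2) +
    log_p_o(par_new, o[par_new] + 1) + log_p_o(par_old, o[par_old] - 1) -
    log_p_o(par_new, o[par_new])     - log_p_o(par_old, o[par_old]) +
    lfactorial(o[par_new] + 1) + lfactorial(o[par_old] - 1) -
    lfactorial(o[par_new])     - lfactorial(o[par_old])
  if (log(runif(1)) < log_r) p[j] <- par_new
  p
}
```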

1.7 Updating \({\mathbf {x}}\)

When updating \({\mathbf {x}}\) by a birth (adding a point) or death (removing a point), we must also update \({\mathbf {p}}(y)\). Both birth and death proposals are uniform, i.e. the location of a new parent is uniform on W and the parent to remove at a death is uniform on \({\mathbf {x}}\).

At a birth, we propose a new set of connections, \({\mathbf {p}}(y)^*\), by proposing offspring to “steal” from other parents. Given that we propose to add a parent, \(x^*\), the probability that \(y \in {\mathbf {y}}\) is stolen is determined by Gaussian kernel weights at the distances from y to its current parent and to \(x^*\), as

$$\begin{aligned} P(P_y^* = x^* ) = \frac{\exp \left( -\frac{||y-x^*||^2}{2\sigma _{bd}^2}\right) }{\exp \left( -\frac{||y-x^*||^2}{2\sigma _{bd}^2}\right) +\exp \left( -\frac{||y-p_y||^2}{2\sigma _{bd}^2}\right) } := b_y(x^*), \end{aligned}$$

where \(\sigma _{bd}\) is a tuning parameter.

When a point, \(x_0 \in {\mathbf {x}}\), is proposed for a death, the probability that \(y\in {\mathbf {y}}\) is instead assigned to \(x'\) is given by

$$\begin{aligned} P( P_y^* = x' ) = \frac{\exp \left( -\frac{||y-x'||^2}{2\sigma _{bd}^2}\right) }{\sum _{x\in {\mathbf {x}}\setminus x_0} \exp \left( -\frac{||y-x||^2}{2\sigma _{bd}^2}\right) } := d_y( x'), \end{aligned}$$

where we use the same standard deviation for the Gaussian kernel as in the birth updates.
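The two kernel-weight computations can be sketched in R as follows (`y` is an offspring location, `x_star` the proposed new parent, `p_y` the current parent location, and `x_rest` the coordinate matrix of the remaining parents \({\mathbf {x}}\setminus x_0\); names are ours); the proposal densities below then combine these weights:

```r
b_y <- function(y, x_star, p_y, sigma_bd) {   # birth: steal probability
  w_new <- exp(-sum((y - x_star)^2) / (2 * sigma_bd^2))
  w_old <- exp(-sum((y - p_y)^2)    / (2 * sigma_bd^2))
  w_new / (w_new + w_old)
}

d_y <- function(y, x_rest, sigma_bd) {  # death: reassignment probabilities
  w <- exp(-rowSums(sweep(x_rest, 2, y)^2) / (2 * sigma_bd^2))
  w / sum(w)
}
```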

With this notation we get the following proposal density for a birth

$$\begin{aligned} q_b( {\mathbf {x}}^*, {\mathbf {p}}(y)^*| {\mathbf {x}}, {\mathbf {p}}(y) ) = |W|^{-1} \prod _{y:p_y^* = x^*} b_y( x^* ) \prod _{y: p_y^* \ne x^*} (1-b_y(x^*)), \end{aligned}$$

where \({\mathbf {x}}^* = {\mathbf {x}}\cup x^*\) and \({\mathbf {p}}(y)^*\) is obtained by proposing offspring to steal. The proposal density for a death is given by

$$\begin{aligned} q_d( {\mathbf {x}}^*, {\mathbf {p}}(y)^*| {\mathbf {x}}, {\mathbf {p}}(y) ) =N({\mathbf {x}})^{-1} \prod _{y:p_y = x_0} d_y( p_y^* ), \end{aligned}$$

where \({\mathbf {x}}^* = {\mathbf {x}}\setminus x_0\) and \({\mathbf {p}}(y)^*\) is obtained by reassigning the current offspring of \(x_0\) according to the above. This yields the Hastings ratio for a proposed birth update (in which the number of factors in the products changes) as

$$\begin{aligned}&r_{birth}( {\mathbf {x}}^*, {\mathbf {p}}(y)^*| {\mathbf {x}}, {\mathbf {p}}(y)) \\&\quad =\exp \left( -\frac{\sum _{y: p^*_y = x^*}(||y - x^*||^2 - ||y-p_y||^2)}{2\omega ^2}\right) \\&\qquad \times \frac{\prod _{x\in {\mathbf {x}}^*}p(o_x(y)^*|x, \lambda , \theta , \omega )o_x(y)^*!}{\prod _{x\in {\mathbf {x}}}p(o_x(y)|x, \lambda , \theta , \omega )o_x(y)!}\\&\qquad \times \kappa \frac{q_d({\mathbf {x}}, {\mathbf {p}}(y)|{\mathbf {x}}^*, {\mathbf {p}}(y)^* ) }{q_b({\mathbf {x}}^*, {\mathbf {p}}(y)^*|{\mathbf {x}}, {\mathbf {p}}(y) )}, \end{aligned}$$

and for a proposed death update as

$$\begin{aligned}&r_{death}( {\mathbf {x}}^*, {\mathbf {p}}(y)^*| {\mathbf {x}}, {\mathbf {p}}(y) ) \\&\quad =\exp \left( -\frac{1}{2\omega ^2}\sum _{y: p_y = x_0}(||y - p_y^*||^2 - ||y-x_0||^2)\right) \\&\qquad \times \frac{\prod _{x\in {\mathbf {x}}^*}p(o_x(y)^*|x, \lambda , \theta , \omega )o_x(y)^*!}{\prod _{x\in {\mathbf {x}}}p(o_x(y)|x, \lambda , \theta , \omega )o_x(y)!} \\&\qquad \times \kappa ^{-1} \frac{q_b({\mathbf {x}}, {\mathbf {p}}(y)|{\mathbf {x}}^*, {\mathbf {p}}(y)^* ) }{q_d({\mathbf {x}}^*, {\mathbf {p}}(y)^*|{\mathbf {x}}, {\mathbf {p}}(y) )}. \end{aligned}$$

When proposing a move of a point, x, of the current parent pattern, we propose a new location, \(x^*\), according to a Gaussian kernel with standard deviation \(\sigma _m\) around x restricted to the observation window, i.e.

$$\begin{aligned} q_{move}( x^*|x) \propto \frac{\exp \left( -\frac{1}{2\sigma _m^2}||x^*-x||^2 \right) }{\nu _W(x; \sigma _m)}{\mathbf {1}}\{ x^* \in W \}, \end{aligned}$$

and since only one point of the pattern is proposed to be moved, the Hastings ratio is given by

$$\begin{aligned}&r_{move}( x^*|x ) = \exp \left( -\frac{1}{2\omega ^2}\sum _{y:p_y = x}(||y-x^*||^2-||y-x||^2)\right) \\&\quad \frac{\nu _W(x;\sigma _m)}{\nu _W(x^*;\sigma _m)}. \end{aligned}$$
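A sketch of the move update in R, reusing `nu_W` from the earlier sketch (rectangular window \([0,a] \times [0,b]\)); `x` is the parent being moved and `offs_mat` the coordinates of its current offspring (names are ours). The restriction to W is implemented by rejection sampling from the Gaussian kernel:

```r
move_parent <- function(x, offs_mat, omega, sigma_m, a, b) {
  repeat {  # Gaussian proposal restricted to the window W
    x_new <- rnorm(2, mean = x, sd = sigma_m)
    if (all(x_new >= 0) && x_new[1] <= a && x_new[2] <= b) break
  }
  d2 <- function(z) sum(sweep(offs_mat, 2, z)^2)  # sum of squared distances
  log_r <- -(d2(x_new) - d2(x)) / (2 * omega^2) +
    log(nu_W(x, sigma_m, a, b)) - log(nu_W(x_new, sigma_m, a, b))
  if (log(runif(1)) < log_r) x_new else x
}
```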

1.8 Irreducibility of Markov chain

A Markov chain is said to be irreducible if it is possible to get from any state to any other state. Irreducibility in \({\mathbf {x}}\), \(\kappa \) and \(\omega \) is straightforward. The difficulty is that, for negative \(\lambda \), not all configurations of \({\mathbf {p}}(y)\) are accessible. However, there is a positive probability that the Markov chain first reaches values of \(\lambda \) and \(\theta \) from which the desired configuration of \({\mathbf {p}}(y)\) is accessible. The same argument applies to values of \(\lambda \) and \(\theta \) that are inaccessible from a given configuration of \({\mathbf {p}}(y)\). Thus the whole Markov chain is irreducible.


Cite this article

Andersson, C., Mrkvička, T. Inference for cluster point processes with over- or under-dispersed cluster sizes. Stat Comput 30, 1573–1590 (2020). https://doi.org/10.1007/s11222-020-09960-8
