Abstract
In this chapter, different statistical models for the observations in nanoscale photonic imaging are discussed. While providing models of increasing accuracy and complexity, we develop a guideline for which model should be chosen in practice, depending on the total number of detected photons as well as their spatial and temporal dependency structure. We focus on different Gaussian, Poissonian, Bernoulli and Binomial models and link them to projects treated within the SFB 755.
Essentially, all models are wrong, but some are useful.
— George Box
1 Introduction
1.1 Background and Examples
The term ‘photonic imaging’ describes an optical imaging setup where the available measurement data Y are counts of detected photons. The origin of these photons can be diverse in nature. In coherent X-ray imaging (see e.g. Chap. 2), photons emitted by an X-ray source (such as a free-electron laser) are scattered (and/or absorbed) by a specimen. In fluorescence microscopy (see e.g. Chap. 1 or Chap. 7), marker molecules are excited by an excitation pulse and emit photons with a certain probability. These two examples are characteristic of the wide range of scenarios arising in photonic imaging: in coherent X-ray imaging we have on the one hand single-molecule diffraction data composed of only a few photons [1], and on the other hand holographic experiments where millions of photons can be collected from one sample [2]. In fluorescence microscopy, the number of photons is intrinsically limited to a few hundred or thousand per marker due to bleaching effects, and in the case of temporally resolved measurements, only a handful of photons is available per time step [3]. Similar restrictions arise in related imaging modalities, including those based on Förster resonance energy transfer (FRET) or metal-induced energy transfer (MIET), see e.g. Chap. 8 or [4, 5] for a discussion. Astrophysical imaging, although outside the context of nanoscale imaging, is statistically closely related. Here, there is no a priori limit on the observation time and hence on the number of photons. However, the former is practically limited to several minutes to avoid severe motion blur, see e.g. [6, 7] for examples. We also mention positron emission tomography (PET), where the total number of emitted photons should be as small as possible to minimize the radiation dose for the patient [8]. In all of these applications, detected photons can also originate from undesired background contributions, whose nature strongly depends on the experimental setup, adding additional noise to the observations.
1.2 Purpose of the Chapter
The aim of this chapter is to give an overview of prototypical approaches to model the data emerging in photonic imaging from a statistical point of view, based on the physical modeling of photon observation. A sketch of the typical imaging setup we consider is presented in Fig. 4.1.
We assume that the imaging process is described by an underlying photon intensity \(\lambda : \varOmega \times \left[ 0,T\right] \rightarrow \left[ 0,\infty \right) \) at the detector interface, where \(\varOmega \) is the spatial domain of observation (which can be two- or three-dimensional) and T is the total observation time. Let us enumerate the emitted photons by \(1,\ldots , N\) and denote their specific detection position and time by \(\left( \mathbf {x}_i, t_i\right) \in \varOmega \times \left[ 0,T\right] \). For a given (measurable) subset \(A \subset \varOmega \) and time interval \(I \subset \left[ 0,T\right] \) we write \(Y\left( A \times I\right) := \# \left\{ 1 \le i \le N \,\big |\, \mathbf {x}_i \in A,\ t_i \in I\right\} \) to denote the number of photons observed in A during I. The expected number of photons detected in \(A\times I\) is by definition of \(\lambda \) given by
$$ \mathbb {E}\left[ Y\left( A \times I\right) \right] = \int \limits _I \int \limits _A \lambda \left( \mathbf {x}, t\right) \,\mathrm d \mathbf {x}\,\mathrm d t. \qquad \qquad (4.1) $$
Note that this includes all detected photons, including all background contributions. We will always assume \(\lambda \ge 0\), which ensures that the integral in (4.1) is well-defined (although it might be \(\infty \)).
Throughout this manuscript, we will discuss statistical models for the distribution of the observations Y, depending on the physical measurement setup. We assume \(\lambda \) to be given, as deriving or estimating \(\lambda \) and/or other model parameters described (implicitly) by \(\lambda \) is the topic of other expositions (see e.g. Chap. 5 or Chap. 11).
1.3 Measurement Devices
Depending on the type of sensor used for photon detection, different models for photonic imaging settings have been proposed. One commonality of all measurement setups is that the spatial domain of observation \(\varOmega \) is discretized into detector regions, so-called bins. We will assume that the detectors on all bins have identical physical properties, and we denote the centers of such bins by \(\mathbf {x}\in \varXi \) with \(\varXi \) being the set of all bin centers. If a charge-coupled device (CCD) camera is used for detection, all bins (the pixels of the sensor) can be observed simultaneously. This is e.g. the case in most coherent X-ray experiments and in astrophysical imaging. PET requires a tomographic setup consisting of several photomultiplier tubes (PMTs) surrounding the patient (see e.g. [9]). In confocal fluorescence microscopy the most widely applied detectors are based on avalanche photodiodes, which can measure photons in only one bin at a time. Hence, the domain of observation \(\varOmega \) is typically scanned by physically moving the specimen (or detector) at a fast pace. Temporally simultaneous photons can be measured as well, but this requires a different experimental setup (see e.g. [10]).
Most photon detectors rely on the photoelectric effect. With a certain probability (the quantum efficiency), incident photons release photoelectrons on the detector surface. Since single electrons cannot be detected reliably, the signal is typically amplified by a cascade of electron-multiplying systems. This introduces additional noise due to the stochastic nature of the multiplying steps. Another complication is the existence of dead times. The dead time of a detection device refers to the time interval (after activation) during which it is unable to record another event. Dead times can, for example, arise due to the necessity to recharge conductors in between measurements, or due to time delays caused by analog-to-digital conversion and data storage. Details on the statistics of different detectors can be found in [11, Chap. 12].
1.4 Structure and Notation
For the remainder of this chapter we will develop and discuss models for the right part of Fig. 4.1 with different degrees of accuracy. The model choice mainly depends on the total number of detected photons and on the spatial and temporal dependency structure of the randomly generated photons. We will start with the Poisson model, which is well-known and most common for many applications. It can be derived immediately from (4.1) under the assumption of independence, which explains its wide use in photonic imaging (see e.g. the reviews [7, 12] and the references therein). However, if it is necessary to count photons on small time scales, or if independence does not hold, more refined modeling is required. In these situations, we subsequently turn to Bernoulli and Binomial models, and discuss to what extent they are compatible with the aforementioned Poisson model. Finally, we turn to the case of large counting rates, which leads to Gaussian models based on asymptotic normality. We discuss differences and commonalities arising from the different base models and indicate which model should be used in which situation. This will be linked to different examples from this book, for which we discuss whether our assumptions are met.
Let us introduce the basic notation used in this chapter. We will always assume that any observation y is the realization of a random variable Y, and we will denote by \(\mathbb P\) probabilities w.r.t. this random object. By \(\mathbb E\) and \(\mathbb V\) we will denote the expectation and variance w.r.t. \(\mathbb P\), respectively. The letters \(\mathcal P, \mathcal B\) and \(\mathcal N\) will denote the Poisson, Binomial and normal distribution introduced below. Random variables will always be denoted by capital letters \(X, X_i, Z\) etc., and if we write i.i.d. for a sequence \(X_1, X_2,\ldots \) of random variables, this stands for independent identically distributed.
2 Poisson Modeling
Suppose we have a perfect photon detector that registers the individual arrival times of all emitted photons reaching a bin without missing any. For the moment we will focus on describing a single bin to avoid notational difficulties. In this situation, the total number of collected photons can often be modeled as Poissonian. A random variable X follows a Poisson law with parameter (intensity) \(\mu \ge 0\), if
$$ \mathbb {P}\left[ X = k\right] = \frac{\mu ^k}{k!} \exp \left( -\mu \right) , \qquad k \in \mathbb {N}_0. $$
We write \(X \sim \mathcal P \left( \mu \right) \). The following fundamental theorem about point processes explains why the Poisson distribution often comes into play when modeling photon counts:
Theorem 4.1
Suppose we observe a random number N of photons at random arrival times \(0 \le t_1< \cdots < t_N \le T\) such that

(a)
for each choice of disjoint intervals \(I_1, \ldots , I_n \subset \left[ 0,T\right] \), the random variables \(\#\left\{ 1 \le k \le N \,\big |\, t_k \in I_i\right\} \), \(1 \le i \le n\), corresponding to the number of observed photons during \(I_i\) are independent, and

(b)
there exists some integrable function \(\mu \) on \(\left[ 0,T\right] \) such that for any choice \(0 \le a < b\le T\) it holds
$$ \mathbb {E}\left[ \#\left\{ 1 \le k \le N \,\big |\, a \le t_k \le b\right\} \right] = \int \limits _a^b \mu \left( t\right) \,\mathrm d t. $$
Then, for all \(0 \le a < b\le T\), the number of photons observed between time a and time b is Poisson distributed with parameter \(\int _a^b \mu \left( t\right) \,\mathrm d t\), i.e.
For the proof we refer to [13, Theorem 1.11.8]. In terms of probability theory, this theorem implies that the point process \(X := \sum _{i=1}^N \delta _{t_i}\), with \(\delta _t\) denoting the Dirac measure at t, is a Poisson point process with intensity \(\mu \) if the stated assumptions are satisfied.
Let us discuss these assumptions. Condition (b) underlies our whole modeling procedure as described in (4.1) and seems universally plausible. Temporal independence of the arrival times in (a) is more critical, but seems (at least approximately) reasonable in many imaging modalities where photons arise from a high-intensity source, including coherent X-ray imaging. However, if the photons arise from fluorescent markers, temporal independence can be violated due to hidden internal states of the fluorophores, energy transfer between different fluorophores on small temporal and spatial scales (e.g. FRET), or dead times of the detectors.
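To see Theorem 4.1 at work, one can simulate an inhomogeneous point process and check that interval counts are indeed Poissonian, i.e. that their empirical mean and variance both agree with \(\int _a^b \mu \left( t\right) \,\mathrm d t\). The following sketch (the intensity function and all parameter values are invented for illustration and are not taken from the chapter) uses Lewis-Shedler thinning:

```python
import math
import random

random.seed(1)

T = 1.0
MU_MAX = 100.0                                    # upper bound on mu over [0, T]

def mu(t):
    """Hypothetical time-varying photon intensity (illustration only)."""
    return 50.0 * (1.0 + math.sin(2.0 * math.pi * t))

def sample_arrivals():
    """Lewis-Shedler thinning: simulate a homogeneous Poisson process at rate
    MU_MAX and keep each arrival t with probability mu(t) / MU_MAX."""
    t, arrivals = 0.0, []
    while True:
        t += random.expovariate(MU_MAX)
        if t > T:
            return arrivals
        if random.random() < mu(t) / MU_MAX:
            arrivals.append(t)

# Theorem 4.1: the count in [a, b] is Poisson with parameter int_a^b mu(t) dt,
# so its empirical mean and variance should both match that integral.
a, b = 0.2, 0.7
counts = [sum(1 for t in sample_arrivals() if a <= t <= b) for _ in range(5000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
target = sum(mu(a + (b - a) * (k + 0.5) / 1000) * (b - a) / 1000 for k in range(1000))
print(round(mean, 2), round(var, 2), round(target, 2))
```

Since mean and variance of a Poisson variable coincide, their joint agreement with the integrated intensity is a quick consistency check of assumptions (a) and (b).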
If temporal independence is given, then Theorem 4.1 states that the number \(Y_{\mathbf {x}, t}\) of collected photons within a bin \(B_{\mathbf {x}}\) until time \(t \in \left[ 0,T\right] \) can naturally be modeled by a Poissonian random variable with intensity \(\int _0^t \int _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau \). This gives rise to the following model:
Poisson model
Let the spatial domain of observation \(\varOmega \) be discretized into bins \(B_{\mathbf {x}}\) with centers \(\mathbf {x}\in \varXi \). We assume that our observations are given by a field \(Y_t := \left( Y_{\mathbf {x}, t}\right) _{\mathbf {x}\in \varXi }\) of random variables such that
$$ Y_{\mathbf {x}, t} \sim \mathcal P \left( \int \limits _0^t \int \limits _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau \right) , \qquad \mathbf {x}\in \varXi ,\ t \in \left[ 0,T\right] , \qquad \qquad (4.2) $$
for some intensity function \(\lambda \ge 0\).
This is the basis of many popular models covering a variety of distinct applications. Examples include PET (see Vardi et al. [9]), astronomy and fluorescence microscopy (see Bertero et al. [7] or Hohage and Werner [12]), or a more subtle model for CCD cameras due to Snyder et al. [14, 15].
Note that so far we have assumed that all arriving photons are collected by the detector. This will, however, never be the case due to several physical limitations, see Fig. 4.2.
The specific efficiency depends strongly on the setup and can vary considerably. In addition to the different quantum efficiencies of different detectors, it might also happen that the detector does not cover all of \(\varOmega \) or has some dead subregions (such as interfaces between individual elements). This causes a loss of measured photons and hence a statistical thinning of the random variable \(Y_{\mathbf {x}, t}\). In this case, the actually observed random variable \(\widetilde{Y}_{\mathbf {x},t}\) can be written as
$$ \widetilde{Y}_{\mathbf {x},t} = \sum \limits _{i=1}^{Y_{\mathbf {x}, t}} X_i \qquad \qquad (4.3) $$
with Bernoulli random variables \(X_i\) having success probabilities \(\eta _i \in \left[ 0,1\right] \), where each \(X_i\) indicates whether the ith photon has been detected. If, in addition, the thinning happens identically and independently for each photon, i.e. \(X_i {\mathop {\sim }\limits ^{\text {i.i.d.}}} \mathcal B \left( 1,\eta \right) \), only the parameter in the Poisson law (4.2) changes, but not its distributional structure. More precisely, in this case it follows (see the Appendix) that
$$ \widetilde{Y}_{\mathbf {x},t} \sim \mathcal P \left( \eta \int \limits _0^t \int \limits _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau \right) . $$
Consequently, the imperfection of a detector (as long as the induced thinning happens independently for each photon) can be seen as a scaling of the underlying photon intensity \(\lambda \) by an efficiency factor \(\eta \in \left( 0,1\right] \). In agreement with Fig. 4.1 we can hence assume in the following that all physical processes causing a thinning have already been treated when modeling \(\lambda \).
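The claim that independent thinning only rescales the Poisson parameter can be checked numerically. In the following sketch (the values of the intensity and the efficiency \(\eta \) are invented for illustration), each of a Poisson number of photons is kept with probability \(\eta \), and the resulting counts again show the Poisson signature mean = variance = \(\eta \mu \):

```python
import math
import random

random.seed(2)

mu, eta = 30.0, 0.6   # underlying Poisson intensity and detector efficiency (illustrative)

def poisson(lam):
    """Knuth's Poisson sampler: multiply uniforms until the product drops below exp(-lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p >= L:
        p *= random.random()
        k += 1
    return k - 1

def thinned_count():
    n = poisson(mu)                                       # photons reaching the detector
    return sum(random.random() < eta for _ in range(n))   # each one detected w.p. eta

samples = [thinned_count() for _ in range(40000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))   # both close to eta * mu = 18, i.e. again Poissonian
```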
Besides this kind of independent thinning, a further important issue in many imaging modalities is the dead time \(\varDelta t\) of the employed detector. Dead times can vary significantly depending on the type of detector, but usually are in the range of nanoseconds. If a photon arrives at time \(t \in \left[ 0,T\right) \), the detector will only be able to record the next photon arriving after \(t + \varDelta t\). Note that whenever \(\varDelta t > 0\), at most \(T/\varDelta t\) photons can be detected during the whole measurement, which contradicts (4.2) in the sense that \(\mathbb {P}\left[ Y_{\mathbf {x}, T} >T/\varDelta t\right] = 0\) in this case. Such an upper limit on the total number of detected photons can crucially change the distribution, which can, e.g., be seen from the following fact proven in the appendix:
Theorem 4.2
Fix \(\mathbf {x}\in \varXi \) and let \(I_1, \ldots , I_m\) be a decomposition of \(\left[ 0,T\right] \) into disjoint intervals. Denote by \(X_i\) the number of photons observed during \(I_i\) in bin \(B_{\mathbf {x}}\). Assume model (4.2), and suppose that \(X_1,\ldots , X_m\) are independent. Then the conditional distribution of \((X_1, \ldots , X_m)\) given \(Y_{\mathbf {x}, T} = N\) is multinomial with parameter N and probability vector \(\left( p_1, \ldots , p_m\right) \), where
$$ p_i = \frac{\int _{I_i} \int _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau }{\int _0^T \int _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau }. $$
In other words, Theorem 4.2 states that, conditioning on the total number of photons, the arrival times of individual photons behave like a Bernoulli process with intensity \(\tau \mapsto \int _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\). This implies that conditioning on the total number of photons introduces a dependency structure between the number of counts during different time intervals. Consequently, if \(\varDelta t\) cannot be neglected, temporal independence is not given anymore, hence corrupting the Poisson law, and different modeling approaches are needed.
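Theorem 4.2 can also be illustrated by simulation (the parameter values below are invented for illustration): drawing independent Poisson counts for two disjoint intervals and conditioning on their total N, the first count should behave like a \(\mathcal B\left( N, p_1\right) \) variable, with \(p_1\) the relative share of the expected intensity:

```python
import math
import random

random.seed(3)

mu1, mu2 = 4.0, 6.0   # expected counts in two disjoint time intervals (illustrative)

def poisson(lam):
    # Knuth's Poisson sampler
    L, k, p = math.exp(-lam), 0, 1.0
    while p >= L:
        p *= random.random()
        k += 1
    return k - 1

# Keep X1 from runs in which the total count X1 + X2 equals a fixed value N
N, hits = 10, []
while len(hits) < 10000:
    x1, x2 = poisson(mu1), poisson(mu2)
    if x1 + x2 == N:
        hits.append(x1)

# Theorem 4.2: given the total N, X1 ~ Binomial(N, p1) with p1 = mu1 / (mu1 + mu2) = 0.4,
# hence conditional mean N * p1 = 4 and variance N * p1 * (1 - p1) = 2.4
mean = sum(hits) / len(hits)
var = sum((h - mean) ** 2 for h in hits) / len(hits)
print(round(mean, 2), round(var, 2))
```

Note that the conditional variance 2.4 is strictly smaller than the conditional mean 4: conditioning on the total destroys the Poisson property, as discussed above.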
3 Bernoulli Modeling
To measure the temporal structure of the incoming photons, counting as described above is not sufficient. In such cases, photons are counted consecutively during (short) time frames. We suppose that the discretization of the temporal measurement process is so fine that the temporal aggregation underlying the Poisson model is no longer appropriate. This is described by (equidistant) time frames, i.e. consecutive intervals \(I_1, I_2, \ldots , I_n \subset \left[ 0,T\right] \) of equal length \(\delta >0\), chosen such that the probability of observing more than one photon in any bin \(B_{\mathbf {x}}\) during any interval is sufficiently close to 0, and separated by a waiting time \(\epsilon >0\), which allows us to ignore the dead time. In this situation, the following model is a reasonable approximation:
Bernoulli model
For \(\mathbf {x}\in \varXi \) and \(1 \le i \le n\) the random variable \(Y_{\mathbf {x}, i}\) indicating if a photon arrives in bin \(B_{\mathbf {x}}\) during the time interval \(I_i\) follows a Bernoulli distribution,
with success probability
As mentioned before, the detector will hardly count all arriving photons, which causes a statistical thinning as in (4.3). If the thinning happens independently of the photon arrivals, we obtain \(\widetilde{Y}_{\mathbf {x}, i} \sim \mathcal B\left( 1,\eta \cdot p_{\mathbf {x}, i}\right) \) with the probability \(\eta \) that an incident photon is detected, which immediately follows from \(X \cdot Z \sim \mathcal B \left( 1,pp'\right) \) if \(X \sim \mathcal B \left( 1,p\right) \) is independent of \(Z \sim \mathcal B\left( 1,p'\right) \).
In many imaging setups, it would be difficult to store the whole time series \(Y_{\mathbf {x}, i}\), for instance due to memory limitations. Examples include fluorescence microscopy setups like confocal, STED or 4Pi microscopy, or coherent Xray imaging, where millions of photons are observed in short times, which would require an unreasonably fine time discretization. For other examples like SMS microscopy, however, the temporal structure can be important (e.g. for adjusting temporal drifts, see e.g. [16, 17]) and hence most of the data of the above model has to be used. If temporal dependencies are less important, it is sufficient to count photon arrivals in some interval \(I \subset \left[ 0,T\right] \) larger than \(\delta \), i.e. to consider \(Y_{\mathbf {x}, I} := \sum _{I_i \subset I} Y_{\mathbf {x}, i}\). The distribution of \(Y_{\mathbf {x}, I}\) depends strongly on the temporal dependency structure of the \(Y_{\mathbf {x}, i}\). In case that they are independent and \(p_{\mathbf {x}, i} \equiv p_{\mathbf {x}}\) for all \(1 \le i \le n\), we obtain a Binomial model:
Binomial model
For \(\mathbf {x}\in \varXi \) and \(I \subset \left[ 0,T\right] \), the number of photons observed in the bin centered at \(\mathbf {x}\) during the time interval I is
with \(p_{\mathbf {x}, i} \equiv p_{\mathbf {x}}\) for all \(1 \le i \le n\) and \(p_{\mathbf {x},i}\) as in (4.5).
Note that if we proceed similarly with the thinned observations \(\widetilde{Y}_{\mathbf {x}, i}\), we obtain \(\widetilde{Y}_{\mathbf {x}, I} \sim \mathcal B\left( \# \left\{ I_i \subset I\right\} , \eta p_{\mathbf {x}}\right) \), which is the canonical thinning of (4.6), see e.g. [18].
Independence of the \(Y_{\mathbf {x}, i}\) is strongly connected to the photon source, as discussed above. If \(\epsilon \ge \varDelta t\), the dead times of the detectors have no influence on the temporal dependency structure anymore. The second assumption, \(p_{\mathbf {x}, i} \equiv p_{\mathbf {x}}\) for all \(1 \le i \le n\), is equivalent to stationarity of the underlying photon source, which again depends on the imaging modality. If, e.g., a freezedried sample is imaged sufficiently fast, then this assumption is reasonable.
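Conversely, when the dead time is not negligible relative to the photon flux, its distorting effect can be made visible by simulation. In the following sketch (non-paralyzable dead time; the rate, observation window and dead time are invented illustration values), the variance of the detected counts falls well below their mean, confirming that the Poisson model breaks down in this regime:

```python
import random

random.seed(4)

RATE, T, DEAD = 100.0, 1.0, 0.005   # arrival rate, window and dead time (illustrative)

def detected_count():
    """Count arrivals of a rate-RATE Poisson stream, discarding any arrival that
    falls within DEAD seconds of the previously *detected* one (non-paralyzable)."""
    t, last, n = 0.0, float("-inf"), 0
    while True:
        t += random.expovariate(RATE)
        if t > T:
            return n
        if t - last >= DEAD:
            n += 1
            last = t

samples = [detected_count() for _ in range(10000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# The mean drops from RATE * T = 100 towards roughly RATE * T / (1 + RATE * DEAD) = 66.7,
# and the variance falls clearly below the mean: the counts are no longer Poissonian.
print(round(mean, 1), round(var, 1))
```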
Besides temporal dependencies, the field of random variables can also have a spatial dependency structure. In many modalities the random variables are independent for different pixels or voxels \(\mathbf {x}\), but on sufficiently small scales some dependency can occur, e.g., due to energy transfer between molecules.
3.1 Law of Small Numbers
It is a fundamental and well-known fact that a Binomial distribution can in certain situations be approximated by a Poisson distribution. In this section, we will discuss how this provides a link between the initial Poisson model (4.2) and the preceding Bernoulli model (4.4). To this end, we recall the so-called law of small numbers, which will be stated in terms of Le Cam's theorem [19]. For the moment we suppress dependencies on \(\mathbf {x}\) and consider only a single Binomial random variable, corresponding to a fixed bin.
Theorem 4.3
(Law of small numbers) Let \(X_1, \ldots , X_m\) be independent and Bernoulli distributed with success probabilities \(q_1, \ldots , q_m\). Then the distribution of \(X := X_1 + \cdots + X_m\) can be approximated by \(\mathcal P \left( \lambda _m\right) \) with \(\lambda _m = -\sum _{i=1}^m \log \left( 1-q_i\right) \). More precisely it holds that
For a textbook proof we refer to [20, Theorem 5.1]. Figure 4.3 visualizes the law of small numbers. Note that for small success probabilities the bound on the right-hand side of (4.7) is approximately \(2 \sum _{i=1}^m q_i^2\), since \(-\log \left( 1-x\right) \approx x\) for small x. We furthermore refer to [21, Propositions 4.3 and 4.4], where bounds on the supremum instead of the sum over k on the left-hand side are given. Note that Theorem 4.3 can be generalized to dependent Bernoulli random variables at the price of a worse upper bound, see e.g. [20, Theorem 5.5].
A classical example for this law is the situation where \(q_i \equiv q_m\) for all \(1 \le i \le m\) and \(q_m \cdot m\) converges to some \(\lambda >0\), i.e., \(q_m\sim 1/m\). In this case we may use \(\log \left( 1-x\right) \approx -x\) for small x to obtain \(\lambda _m \approx m q_m \rightarrow \lambda \) and \(2 \sum _{i=1}^m \left( \log \left( 1-q_i\right) \right) ^2 \approx 2 \sum _{i=1}^m q_i^2= 2 m q_m^2 \sim 1/m \rightarrow 0\) as \(m\rightarrow \infty \), i.e., the Binomial distribution of X converges rapidly to the Poisson distribution with parameter \(\lambda \).
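This rate of convergence can be inspected numerically. The sketch below (with \(\lambda = 2\) chosen arbitrarily) evaluates the distance \(\sum _k \left| \mathbb P\left[ X=k\right] - \mathbb P\left[ Z=k\right] \right| \) between \(\mathcal B\left( m, \lambda /m\right) \) and \(\mathcal P\left( \lambda \right) \) for growing m and compares it with the classical Le Cam bound \(2\sum _i q_i^2 = 2\lambda ^2/m\) (for simplicity we approximate with the Poisson parameter \(\sum _i q_i\), the small-\(q_i\) approximation of \(\lambda _m\)):

```python
import math

def binom_pmf(m, q, k):
    return math.comb(m, k) * q ** k * (1.0 - q) ** (m - k)

def pois_pmf(lam, k):
    # log-space evaluation of exp(-lam) * lam**k / k!
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

lam, ms = 2.0, (10, 100, 1000)
tvs = []
for m in ms:
    q = lam / m
    # sum_k |P[X = k] - P[Z = k]|; the tail beyond k = 40 is negligible here
    tv = sum(abs(binom_pmf(m, q, k) - pois_pmf(lam, k)) for k in range(41))
    tvs.append(tv)
    print(m, round(tv, 5), "Le Cam bound:", 2.0 * lam ** 2 / m)
```

The computed distances decay like 1/m, in line with the discussion above.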
On the other hand, if the success probabilities \(q_i \equiv q \in \left( 0,1\right) \) are fixed, the right-hand side of (4.7) diverges. This is intuitive, as in this situation convergence towards a normal distribution has to be expected (cf. Sect. 4.4.1 below). This is in line with the observation that a Poisson distribution with growing parameter \(\lambda _m = -m \log \left( 1-q\right) \) converges towards a normal distribution (cf. Sect. 4.4.2 below).
Let us now compare the two Poisson laws arising from Theorem 4.3 and (4.2). According to (4.4), our observations are Binomial random variables with success probability
where we used that the probability to observe more than one photon is close to 0. Hence, if we denote the largest time in \(I_m\) by \(t_m\), and use again \(\log \left( 1x\right) \approx x\), then the number of photons observed until time \(t_m\) is approximately Poisson distributed with parameter \(\lambda _{\mathbf {x},m} = p_{\mathbf {x},1} + \cdots + p_{\mathbf {x},m} \approx \int _0^{t_m} \int _{B_{\mathbf {x}}}\lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau \) ignoring the waiting times. This is in good agreement with (4.2). According to (4.7), the error in this approximation is bounded by
showing that it is valid whenever the temporal discretization is sufficiently fine.
4 Gaussian Modeling
4.1 As Approximation of the Binomial Model
Besides the approximation by a Poisson distribution, it is well-known that a Binomial model can also be approximated by a Gaussian one under suitable circumstances. Let us start with the Bernoulli model (4.4) and suppose that all \(Y_{\mathbf {x},i}\) are independent with \(p_{\mathbf {x},i} \equiv p_{\mathbf {x}}\). If we are interested in the total number of counts \(Y_{\mathbf {x}}:= \sum _{i=1}^n Y_{\mathbf {x},i}\) in bin \(\mathbf {x}\), the de Moivre-Laplace theorem states that
in distribution, where \(Z \sim \mathcal N \left( 0,1\right) \) follows a standard normal distribution. Note that \(\frac{Y_{\mathbf {x}} - n p_{\mathbf {x}}}{\sqrt{np_{\mathbf {x}}\left( 1-p_{\mathbf {x}}\right) }}\) is just the centered and standardized version of the total number of counts \(Y_{\mathbf {x}}\). This implies that the distribution of \(Y_{\mathbf {x}}\) can be approximated by a Gaussian distribution with mean \(n p_{\mathbf {x}}\) and variance \(np_{\mathbf {x}}\left( 1-p_{\mathbf {x}}\right) \) if n is sufficiently large. This gives rise to a first Gaussian model:
Gaussian model I
For each \(\mathbf {x}\in \varXi \), the number of photons observed in the bin centered at \(\mathbf {x}\) up to time T is
where \(n = n(T) \sim T/\delta \) with the length \(\delta \) of the individual time frames.
The rate of convergence in (4.8) can be made more precise. For instance, a special case of the Berry-Esseen theorem states
where \(\varPhi \) denotes the distribution function of \(\mathcal N \left( 0, 1\right) \), i.e.,
In fact, the constant on the right-hand side of (4.10) cannot be improved [22]. An interpretation of this theorem is that the approximation leading to the model (4.9) is reasonable as soon as \(n p_{\mathbf {x}}\left( 1-p_{\mathbf {x}}\right) >9\), which implies that the right-hand side of (4.10) is bounded by \(\frac{\sqrt{10} + 3}{18 \sqrt{2\pi }} \approx 0.137\).
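Since the binomial CDF has only finitely many jumps, the sup-distance in the de Moivre-Laplace approximation can be computed exactly. The following sketch (the choice p = 0.3 and the values of n are arbitrary illustration values) shows the distance shrinking at the \(1/\sqrt{n}\) rate suggested by the Berry-Esseen theorem:

```python
import math

def phi(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_pmf(n, p, k):
    # log-space evaluation avoids overflow for large n
    return math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log(1.0 - p))

def sup_gap(n, p):
    """sup_x |P[(Y - np)/sqrt(np(1-p)) <= x] - Phi(x)| for Y ~ Binomial(n, p);
    the supremum is attained at (or just before) a jump of the binomial CDF."""
    mean, sd = n * p, math.sqrt(n * p * (1.0 - p))
    cdf, gap = 0.0, 0.0
    for k in range(n + 1):
        z = (k - mean) / sd
        pmf = binom_pmf(n, p, k)
        gap = max(gap, abs(cdf - phi(z)), abs(cdf + pmf - phi(z)))
        cdf += pmf
    return gap

gaps = [sup_gap(n, 0.3) for n in (20, 200, 2000)]
print([round(g, 4) for g in gaps])   # shrinks roughly like 1 / sqrt(n)
```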
If the success probabilities \(p_{\mathbf {x}, i}\) do vary in i, the de Moivre-Laplace theorem (4.8) cannot be applied immediately. However, under certain conditions it is still possible to derive an approximate Gaussian model of the form (4.9) by applying the Lindeberg central limit theorem (see e.g. [23]). It states that the sum \(Y_{\mathbf {x}}\), after centering and standardization, still converges to \(\mathcal N \left( 0,1\right) \) in distribution even for non-identically distributed \(Y_{\mathbf {x}, i}\). This motivates a second Gaussian model:
Gaussian model II
For each \(\mathbf {x}\in \varXi \), the number of photons observed in the bin centered at \(\mathbf {x}\) up to time T is
Note that, if the random variables \(Y_{\mathbf {x},i}\) are dependent, the type of dependency very much determines whether a central limit theorem remains valid (with a different limiting variance); see e.g. [24] or [25,26,27] for mixing sequences and [28] for martingale difference sequences, to mention two large classes of examples.
4.2 As Approximation of the Poisson Model
The Poisson model in (4.2) can also be approximated by a Gaussian one. This relies on the fact that the Poisson distribution is infinitely divisible, which means that whenever \(X \sim \mathcal P \left( \mu \right) \), then X can be represented as \(X = X_1 + \cdots + X_n\) for any \(n \in \mathbb N\) with i.i.d. random variables \(X_1, \ldots , X_n \sim \mathcal P \left( \mu /n\right) \). Consequently, the central limit theorem states that
with \(Z \sim \mathcal N \left( 0,1\right) \). The general Berry-Esseen theorem can also be used to bound the error of approximating \(\frac{X-\mu }{\sqrt{\mu }}\) by Z, namely one obtains (see also [29])
Hence, if \(\mu \) is sufficiently large, the distribution of X can be approximated by a Gaussian distribution with mean and variance \(\mu \). If we suppose that \(Y_{\mathbf {x}, t}\) satisfies (4.2) and that \(\int _0^t \int _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau \rightarrow \infty \) as \(t \rightarrow \infty \), then the above reasoning gives rise to another Gaussian model:
Gaussian model III
For each \(\mathbf {x}\in \varXi \), the number of photons observed in the bin centered at \(\mathbf {x}\) up to time t is
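Analogously to the binomial case, one can compute exactly how fast the standardized Poisson distribution approaches the standard normal as \(\mu \) grows (the values of \(\mu \) below are arbitrary illustration choices):

```python
import math

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sup_gap_poisson(mu):
    """sup-distance between the CDF of (X - mu)/sqrt(mu), X ~ Poisson(mu),
    and the standard normal CDF (the sum is truncated far out in the tail)."""
    kmax = int(mu + 12.0 * math.sqrt(mu)) + 30
    cdf, gap, logmu = 0.0, 0.0, math.log(mu)
    for k in range(kmax):
        pmf = math.exp(k * logmu - mu - math.lgamma(k + 1))
        z = (k - mu) / math.sqrt(mu)
        gap = max(gap, abs(cdf - phi(z)), abs(cdf + pmf - phi(z)))
        cdf += pmf
    return gap

gaps = {mu: sup_gap_poisson(mu) for mu in (4, 25, 400)}
print({m: round(g, 4) for m, g in gaps.items()})   # decays roughly like 1 / sqrt(mu)
```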
4.3 Comparison
Let us briefly compare the Gaussian models I-III in (4.9), (4.11) and (4.13), respectively. It is clear that (4.11) is a generalization of (4.9) to the case of non-identical success probabilities \(p_{\mathbf {x},i}\), and both coincide if \(p_{\mathbf {x},i}\) is independent of i. To compare (4.11) with (4.13), we recall from our previous computation that \(p_{\mathbf {x},1} + \cdots + p_{\mathbf {x},n} \approx \int _0^{t_n} \int _{B_{\mathbf {x}}} \lambda \left( \mathbf {y}, \tau \right) \,\mathrm d \mathbf {y}\,\mathrm d \tau \), where \(t_n\) is the largest time in the subinterval \(I_n\). Consequently, (4.11) and (4.13) differ in the variance only by the factors \(1-p_{\mathbf {x},i}\), which are close to one since the success probabilities are usually small. Hence, all three Gaussian models are in good agreement, and (4.13), being the simplest one, should be used.
4.4 Thinning
Taking into account the detection efficiency \(\eta \in \left[ 0,1\right] \) as discussed before, we arrive at models similar to (4.9), (4.11) and (4.13), with the only difference being that \(p_{\mathbf {x}}\), \(p_{\mathbf {x},i}\) or \(\lambda \) is multiplied by \(\eta \). In this sense, the canonical thinning of the Poisson or Binomial models carries over to the Gaussian ones.
4.5 Variance Stabilization
Note that the variance in the Gaussian models I-III is always inhomogeneous, which hinders data analysis with standard methods and causes further difficulties. This can be overcome by variance stabilization. The most popular choice is the celebrated Anscombe transform, which is applied to the Poisson model (4.2) to obtain asymptotically a normal distribution with variance 1. It is based on the following result (see e.g. [30, Lemma 1]):
Lemma 4.1
(Anscombe’s transform) Let \(\mu >0\) and \(Y \sim \mathcal P \left( \mu \right) \) be a Poisson distributed random variable. Then it holds for all \(c \ge 0\) that
From this we can conclude that the choice \(c = 3/8\) ensures that the variance of \(2 \sqrt{Y + c}\) no longer depends on the parameter \(\mu \) up to second order. To reduce the bias, \(c = 1/4\) is the best choice. Furthermore, applying this result to the Poisson model in (4.2) gives rise to a fourth Gaussian model:
Gaussian model IV
For each \(\mathbf {x}\in \varXi \), denote the number of photons observed in the bin centered at \(\mathbf {x}\) up to time t by \(Y_{\mathbf {x}, t}\). Then we assume
for each \(\mathbf {x}\in \varXi \).
We emphasize the importance of the model (4.14) in statistics, as it turns out to be equivalent in a strict sense to the previously discussed Poisson model (4.2) as the total number of photons (and hence the parameter t) tends to \(\infty \) (see e.g. [31,32,33]).
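The stabilizing effect of Anscombe's transform can be verified by computing the variance of \(2\sqrt{Y + 3/8}\) exactly from the Poisson probability mass function (the chosen values of \(\mu \) are illustrative):

```python
import math

def anscombe_variance(mu, c=0.375):
    """Exact variance of 2 * sqrt(Y + c) for Y ~ Poisson(mu),
    with the sum truncated far out in the tail."""
    kmax = int(mu + 12.0 * math.sqrt(mu)) + 30
    logmu = math.log(mu)
    m1 = m2 = 0.0
    for k in range(kmax):
        p = math.exp(k * logmu - mu - math.lgamma(k + 1))   # Poisson pmf at k
        g = 2.0 * math.sqrt(k + c)
        m1 += p * g
        m2 += p * g * g
    return m2 - m1 * m1

mus = (5, 20, 100)
vals = [anscombe_variance(mu) for mu in mus]
print([round(v, 3) for v in vals])   # all close to 1 although the raw variances are 5, 20, 100
```

While the raw Poisson variances grow with \(\mu \), the transformed variances stay close to 1, which is exactly the homogeneity exploited by Gaussian model IV.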
5 Conclusion
In this chapter we introduced models for photonic imaging setups with different degrees of accuracy. The most common and basic Poisson model (4.2) is accurate as soon as the temporal dependency can be neglected and the detector has no significant dead time. If furthermore the number of observed photons is sufficiently large on each bin, then the Gaussian model (4.13) can be used. In case of significant temporal dependency, the Bernoulli model (4.4) with time resolved individual photon arrivals or the resulting Binomial model (4.6) should be considered instead.
An overview about appropriate model choices for the various imaging techniques discussed previously is provided in Fig. 4.4.
In fluorescence microscopy, STED-based methods, which scan the sample pixelwise, record about 10–100 photons per fluorescent marker. Due to low temporal dependencies, we are thus in the scope of the Binomial or Poisson models [3]. Even though a Gaussian approximation seems questionable, as in regions of low intensity only a few photons per bin can be collected, it has been applied successfully employing variance-stabilizing techniques [34]. In order to analyze STORM/PALM data, the full range of modeling approaches is applied. Individual frames contain spots with single or several photons and weak temporal dependency, calling for Bernoulli, Binomial, or Poisson models, while Gaussian approximations are used successfully for drift and rotational corrections [17]. FRET/MIET-based imaging relies heavily on the interactions of fluorescent markers, so that the assumption of temporal independence is violated. This makes the Bernoulli model the model of choice; if more photons are counted, the Binomial model can also be applied [4, 5].
Another example within the scope of the Bernoulli model is the three-photon correlation technique (see e.g. Chap. 16), where molecular structures are probed by femtosecond X-ray pulses. This leads to a high number of images consisting of only a few photons, out of which only triples are used. Inference based on this sequence of images is additionally complicated by rotations of the single target molecules [1].
X-ray diffraction imaging also allows for a whole range of models. At first glance a Gaussian model seems sufficient, as millions of photons are collected in total. However, depending on the specific setup, the photon intensity \(\lambda \) may vary strongly over the detection region. If imaging is performed in a near-field regime, as e.g. in many X-ray microscopy setups, the number of photons in the low-intensity regions is only about one order of magnitude lower than in the high-intensity regions, still allowing for a Gaussian model. Far-field methods stand in contrast to this: while \(10^4\) photons can be collected in high-intensity bins, only a handful of photons arrives in low-intensity regions, making a Binomial and/or Poisson model more suitable [12].
References
von Ardenne, B., Mechelke, M., Grubmüller, H.: Structure determination from single molecule X-ray scattering with three photons per image. Nat. Commun. 9, 2375 (2018)
Bartels, M., Krenkel, M., Haber, J., Wilke, R.N., Salditt, T.: X-ray holographic imaging of hydrated biological cells in solution. Phys. Rev. Lett. 114, 048103 (2015). https://doi.org/10.1103/PhysRevLett.114.048103
Aspelmeier, T., Egner, A., Munk, A.: Modern statistical challenges in high-resolution fluorescence microscopy. Annu. Rev. Stat. Appl. 2, 163–202 (2015)
Graen, T., Hoefling, M., Grubmüller, H.: AMBER-DYES: characterization of charge fluctuations and force field parameterization of fluorescent dyes for molecular dynamics simulations. J. Chem. Theory Comput. 10(12), 5505–5512 (2014). https://doi.org/10.1021/ct500869p. PMID: 26583233
Michalet, X., Weiss, S., Jäger, M.: Single-molecule fluorescence studies of protein folding and conformational dynamics. Chem. Rev. 106(5), 1785–1813 (2006). https://doi.org/10.1021/cr0404343
Adorf, H.M.: Hubble space telescope image restoration in its fourth year. Inverse Probl. 11(4), 639 (1995). http://stacks.iop.org/0266-5611/11/i=4/a=003
Bertero, M., Boccacci, P., Desiderà, G., Vicidomini, G.: Image deblurring with Poisson data: from cells to galaxies. Inverse Probl. 25(12), 123006, 18 (2009). https://doi.org/10.1088/0266-5611/25/12/123006
Sawatzky, A., Brune, C., Wübbeling, F., Kösters, T., Schäfers, K., Burger, M.: Accurate EM-TV algorithm in PET with low SNR. In: 2008 IEEE Nuclear Science Symposium Conference Record, pp. 5133–5137 (2008). https://doi.org/10.1109/NSSMIC.2008.4774392
Vardi, Y., Shepp, L.A., Kaufman, L.: A statistical model for positron emission tomography. J. Am. Stat. Assoc. 80(389), 8–37 (1985). With discussion
Ta, H., Keller, J., Haltmeier, M., Saka, S.K., Schmied, J., Opazo, F., Tinnefeld, P., Munk, A., Hell, S.W.: Mapping molecules in scanning far-field fluorescence nanoscopy. Nat. Commun. 6, 7977 (2015)
Pawley, J. (ed.): Handbook of Biological Confocal Microscopy. Springer (2006)
Hohage, T., Werner, F.: Inverse problems with Poisson data: statistical regularization theory, applications and algorithms. Inverse Probl. 32, 093001, 56 (2016)
Kerstan, J., Matthes, K., Mecke, J.: Infinitely divisible point processes. Wiley Series in Probability and Mathematical Statistics. Wiley (1978)
Snyder, D.L., Helstrom, C.W., Lanterman, A.D., White, R.L., Faisal, M.: Compensation for readout noise in CCD images. J. Opt. Soc. Am. 12(2), 272–283 (1995)
Snyder, D.L., White, R.L., Hammoud, A.M.: Image recovery from data acquired with a charge-coupled-device camera. J. Opt. Soc. Am. 10(5), 1014–1023 (1993)
Geisler, C., Hotz, T., Schönle, A., Hell, S.W., Munk, A., Egner, A.: Drift estimation for single marker switching based imaging schemes. Opt. Express 20(7), 7274–7289 (2012). https://doi.org/10.1364/OE.20.007274
Hartmann, A., Huckemann, S., Dannemann, J., Laitenberger, O., Geisler, C., Egner, A., Munk, A.: Drift estimation in sparse sequential dynamic imaging, with application to nanoscale fluorescence microscopy. J. Roy. Stat. Soc. Ser. B 78(3), 563–587 (2016). https://doi.org/10.1111/rssb.12128
Harremoës, P., Johnson, O., Kontoyiannis, I.: Thinning and the law of small numbers. In: IEEE International Symposium on Information Theory, 2007. ISIT 2007, pp. 1491–1495. IEEE (2007)
Le Cam, L.: An approximation theorem for the Poisson binomial distribution. Pac. J. Math. 10(4), 1181–1197 (1960)
den Hollander, F.: Probability Theory: The Coupling Method (2012)
Novak, S.Y.: Extreme value methods with applications to finance. Monographs on Statistics and Applied Probability, vol. 122. CRC Press, Boca Raton, FL (2012)
Schulz, J.: The optimal Berry-Esseen constant in the binomial case. Ph.D. thesis, University of Trier (2016)
Billingsley, P.: Probability and Measure. Wiley (2008)
Peligrad, M.: On the central limit theorem for triangular arrays of \(\phi \)-mixing sequences. In: Asymptotic Methods in Probability and Statistics (Ottawa, ON, 1997), pp. 49–55. North-Holland, Amsterdam (1998). https://doi.org/10.1016/B978-044450083-0/50005-8
Bradley, R.C.: Introduction to Strong Mixing Conditions, vol. 1. Kendrick Press, Heber City, UT (2007)
Bradley, R.C.: Introduction to Strong Mixing Conditions, vol. 2. Kendrick Press, Heber City, UT (2007)
Bradley, R.C.: Introduction to Strong Mixing Conditions, vol. 3. Kendrick Press, Heber City, UT (2007)
Shorack, G.R.: Probability for Statisticians. Springer Texts in Statistics. Springer (2000)
Lane, J.A.: The Berry-Esseen bound for the Poisson shot-noise. Adv. Appl. Probab. 19(2), 512–514 (1987). https://doi.org/10.2307/1427432
Brown, L., Cai, T.T., Zhang, R., Zhao, L., Zhou, H.: The root-unroot algorithm for density estimation as implemented via wavelet block thresholding. Probab. Theory Relat. Fields 146(3–4), 401–433 (2010)
Grama, I.: Gaussian approximation for nonparametric models. Tatra Mt. Math. Publ. 17, 219–226 (1999)
Grama, I., Nussbaum, M.: Asymptotic equivalence for nonparametric regression. Math. Methods Stat. 11(1), 1–36 (2002)
Ray, K., Schmidt-Hieber, J.: The Le Cam distance between density estimation, Poisson processes and Gaussian white noise. Math. Stat. Learn. 1, 101–170 (2018)
Frick, K., Marnitz, P., Munk, A.: Statistical multiresolution estimation for variational imaging: with an application in Poisson-biophotonics. J. Math. Imaging Vis. 46(3), 370–387 (2013). https://doi.org/10.1007/s10851-012-0368-5
Acknowledgements
We are grateful to Simon Maretzke, Tim Salditt and Britta Vinçon for several helpful comments.
Appendices
Appendix: Poisson Thinning
Let \(\mu >0\), \(\eta \in \left( 0,1\right) \) and suppose \(Y \sim \mathcal P \left( \mu \right) \) and \(X_1,X_2,\ldots \sim \mathcal B \left( 1,\eta \right) \) are independent. The \(\eta \)-thinning of \(Y\) is defined as
\[ \tilde{Y} := \sum _{i=1}^{Y} X_i. \]
We will now show that the distribution of \(\tilde{Y}\) is still Poissonian, but with parameter \(\eta \cdot \mu \). To this end, observe that the probability of \(\tilde{Y}\) being \(k\) is given by the sum over all probabilities of \(Y\) being \(l\) and exactly \(k\) out of the first \(l\) \(X_i\)'s being 1, i.e.
\[ P \left( \tilde{Y} = k\right) = \sum _{l=k}^{\infty } P \left( Y = l\right) \, P \left( \sum _{i=1}^{l} X_i = k\right) \]
by independence. Inserting the Poisson distribution of \(Y\) and the Binomial distribution of \(\sum _{i=1}^l X_i\) gives
\[ P \left( \tilde{Y} = k\right) = \sum _{l=k}^{\infty } \frac{\mu ^l e^{-\mu }}{l!} \binom{l}{k} \eta ^k \left( 1-\eta \right) ^{l-k} = \frac{\left( \eta \mu \right) ^k e^{-\mu }}{k!} \sum _{l=k}^{\infty } \frac{\left( \mu \left( 1-\eta \right) \right) ^{l-k}}{\left( l-k\right) !} = \frac{\left( \eta \mu \right) ^k}{k!} e^{-\eta \mu }, \]
which proves \(\tilde{Y} \sim \mathcal P \left( \eta \mu \right) \).
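This thinning property can also be checked numerically; the following Python sketch (illustrative only, with arbitrarily chosen \(\mu \) and \(\eta \)) verifies that the empirical mean and variance of the thinned counts both match \(\eta \mu \), as expected for a Poisson distribution:

```python
import math
import random

random.seed(2)

def poisson_sample(lam):
    # Knuth's inversion method
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

def thinned_sample(mu, eta):
    """Draw Y ~ Poisson(mu) and keep each of the Y events
    independently with probability eta."""
    return sum(1 for _ in range(poisson_sample(mu)) if random.random() < eta)

mu, eta, n = 10.0, 0.3, 100000
thinned = [thinned_sample(mu, eta) for _ in range(n)]
mean = sum(thinned) / n
var = sum((t - mean) ** 2 for t in thinned) / n
# For Poisson(eta * mu), mean and variance both equal eta * mu = 3.0
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

That mean and variance agree is a characteristic property of the Poisson family and would fail, e.g., for thinned Binomial counts.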
Appendix: Conditioned Poisson Processes
Suppose we observe a random number N of photons at random arrival times \(0 \le t_1< \cdots < t_N \le T\) such that the number of photons between time a and time b is Poisson distributed with parameter \(\int _a^b \mu \left( t\right) \,\mathrm d t\) for a fixed function \(\mu \ge 0\). Given a decomposition of \(\left[ 0,T\right] \) into disjoint intervals \(I_1, \ldots , I_m\), denote by
\[ Y_i := \# \left\{ j \in \left\{ 1, \ldots , N\right\} : t_j \in I_i \right\} \]
the number of photons observed during \(I_i\). Assume furthermore that \(Y_1,\ldots , Y_m\) are independent. We will now show that the conditional distribution of \((Y_1, \ldots , Y_m)\) given \(N = n\) is multinomial with parameter n and probability vector \(\left( p_1, \ldots , p_m\right) \) where
\[ p_i := \frac{\int _{I_i} \mu \left( t\right) \,\mathrm d t}{\int _0^T \mu \left( t\right) \,\mathrm d t}, \qquad i = 1, \ldots , m. \]
To this end, let \(n_1,\ldots , n_m \in \mathbb N_0\) such that \(\sum _{i=1}^m n_i = n\). Then we have
\[ P \left( Y_1 = n_1, \ldots , Y_m = n_m \,\middle |\, N = n\right) = \frac{\prod _{i=1}^m P \left( Y_i = n_i\right) }{P \left( N = n\right) } = \frac{\prod _{i=1}^m e^{-\lambda _i} \lambda _i^{n_i} / n_i!}{e^{-\lambda } \lambda ^n / n!} = \frac{n!}{n_1! \cdots n_m!} \prod _{i=1}^m p_i^{n_i} \]
with \(\lambda _i := \int _{I_i} \mu \left( t\right) \,\mathrm d t\) and \(\lambda := \sum _{i=1}^m \lambda _i = \int _0^T \mu \left( t\right) \,\mathrm d t\),
which proves the claim.
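The multinomial conditioning property can likewise be verified by simulation. The following Python sketch (with arbitrarily chosen integrated intensities) draws independent Poisson counts on two intervals and checks that, conditioned on the total, the first count behaves like a Binomial random variable; for \(m = 2\) intervals the multinomial distribution reduces to a Binomial one.

```python
import math
import random

random.seed(3)

def poisson_sample(lam):
    # Knuth's inversion method
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

# Two intervals with integrated intensities lam1 and lam2 (arbitrary choices)
lam1, lam2 = 2.0, 6.0
p1 = lam1 / (lam1 + lam2)  # = 0.25
n_target = 8               # condition on observing N = 8 photons in total

cond = []  # values of Y1 among runs with Y1 + Y2 == n_target
for _ in range(200000):
    y1, y2 = poisson_sample(lam1), poisson_sample(lam2)
    if y1 + y2 == n_target:
        cond.append(y1)

emp_mean = sum(cond) / len(cond)
# Binomial(8, 0.25) has mean 8 * 0.25 = 2.0
print(f"conditional mean of Y1 = {emp_mean:.2f}")
```

The empirical conditional mean agrees with the Binomial prediction \(n \cdot p_1\), in line with the claim proved above.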
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2020 The Author(s)
Munk, A., Staudt, T., Werner, F. (2020). Statistical Foundations of Nanoscale Photonic Imaging. In: Salditt, T., Egner, A., Luke, D.R. (eds) Nanoscale Photonic Imaging. Topics in Applied Physics, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-34413-9_4
Print ISBN: 978-3-030-34412-2
Online ISBN: 978-3-030-34413-9