1 Introduction

Several studies have shown that some GNSS applications are better served by distributions with tails heavier than those of the normal distribution. Heng et al (2011), for instance, show that GPS satellite clock errors and instantaneous user range errors (UREs) have heavier tails than the normal distribution for about half of the satellites. Similar findings are reported in Dins et al (2015). In fusion studies of GPS and INS, Student’s t-distribution has also been proposed as the better-suited distribution, see e.g. Zhu et al (2012), Zhong and Xu (2018), Wang and Zhou (2019). The same holds for studies of multi-sensor GPS fusion for personal and vehicular navigation (Dhital et al 2013; Al Hage et al 2019).

A suitable class for modelling heavy-tailed data is the class of elliptically contoured (EC) distributions. Many distributions belong to this class (Chmielewski 1981; Cabane et al 1981), with important examples being the multivariate normal distribution, the contaminated normal distribution and the multivariate t-distribution (Kibria and Joarder 2006; Roth 2013).

If we assume our GNSS data vector y, with mean

$$\displaystyle \begin{aligned} E(y) = Aa+Bb\;,\;y \in \mathbb{R}^{m}, a \in \mathbb{Z}^{n}, b \in \mathbb{R}^{p} \end{aligned} $$
(1)

and design matrix \([A,B]\), to be elliptically contoured, then by virtue of linearity the least-squares ambiguity estimator \(\hat {a}\) of a is elliptically contoured as well. Our starting point will therefore be to assume that the probability density function (PDF) of \(\hat {a}\) is a member of the class of EC-distributions and thus given as

$$\displaystyle \begin{aligned} {} f_{\hat{a}}(x)= \vert \Sigma_{\hat{a}\hat{a}}\vert^{-1/2} g( \Vert x-a\Vert_{\Sigma_{\hat{a}\hat{a}}}^{2}) \end{aligned} $$
(2)

where \(a \in \mathbb {Z}^{n}\), \(\Sigma _{\hat {a}\hat {a}} \in \mathbb {R}^{n \times n}\) is positive definite, and \(g: \mathbb {R} \mapsto [0, \infty )\) is a decreasing function that satisfies \(\int _{\mathbb {R}^{n}} g(x^{T}x)dx = 1\) (Cabane et al 1981; Teunissen 2020). As the PDF is completely determined by the three ingredients: the mean \(E(\hat {a})=a\), the matrix \(\Sigma _{\hat {a}\hat {a}}\), and the function g, we write \(\hat {a} \sim {\mathrm{EC}}_{n}(a, \Sigma _{\hat {a}\hat {a}}, g)\).

As \(\hat {a}\) is an unbiased estimator of \(a \in \mathbb {Z}^{n}\), the real-valued ambiguity-float solution \(\hat {a}\) is used to estimate a as \(\check {a}=\mathcal {I}(\hat {a}) \in \mathbb {Z}^{n}\), where \(\mathcal {I}: \mathbb {R}^{n} \mapsto \mathbb {Z}^{n}\) is an admissible integer estimator. Popular examples of \(\mathcal {I}(.)\) are: integer least-squares (ILS), integer bootstrapping (IB) and integer rounding (IR) (Teunissen 1998, 1999). With both \(\hat {a}\) and \(\check {a}\) available, the ambiguity residual is defined as

$$\displaystyle \begin{aligned} \check{\epsilon}=\hat{a}-\check{a} \in \mathbb{R}^{n} \end{aligned} $$
(3)

In current GNSS practice, the ambiguity residual is used for various inferences and ambiguity validation purposes (Verhagen and Teunissen 2004; Teunissen and Montenbruck 2018). Doing so in a statistically meaningful way requires knowledge of the PDF of \(\check {\epsilon }\).

2 Normal, Contaminated Normal and Student’s t-Distribution

Before we commence deriving the PDF of \(\check {\epsilon }\), we first provide a comparative insight into the behaviors of three EC-distributions, namely the normal, the contaminated normal and the Student t-distribution. Their g-functions are given as

$$\displaystyle \begin{aligned} \begin{array}{l} g(x) = (2 \pi)^{-\frac{m}{2}}e^{-\tfrac{1}{2}x}\;({\mathrm{normal}})\\ g(x)= (1-\epsilon)\frac{e^{-\tfrac{1}{2}x}}{(2 \pi)^{\frac{m}{2}}}+\epsilon \frac{\delta^{-\frac{m}{2}}e^{-\tfrac{1}{2\delta}x}}{(2 \pi)^{\frac{m}{2}}}\;({\mathrm{cont.norm}})\\ g(x) = \frac{\Gamma(\frac{m+d}{2})}{(d \pi)^{\frac{m}{2}}\Gamma(\frac{d}{2})}\left[1+\frac{x}{d}\right]^{-\frac{m+d}{2}}\;({\mathrm{Student}}) \end{array} \end{aligned}$$

in which \(x \in \mathbb {R}\), m denotes the dimension of the random vector (so that \(m=n\) for \(\hat {a}\) in (2)), \(\Gamma (.)\) denotes the gamma-function, and d the degrees of freedom of the Student distribution. Both the contaminated normal and the multivariate t-distribution have tails heavier than the normal. The contaminated normal distribution is an \(\epsilon \)-mixture of two normal distributions having the same mean but \(\delta \)-proportional variance matrices. It is a finite mixture distribution that is particularly useful for modelling data thought to contain a distinct subgroup of observations, and it can thus be used to model experimental error or contamination.
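As a minimal sketch, the three g-functions can be coded directly and checked numerically for \(m=1\). The function names, the default parameters (\(\epsilon =0.5\), \(\delta =5\), \(d=3\), borrowed from the figure settings) and the Riemann-sum integrator are illustrative assumptions. The Student constant uses the standard multivariate-t normalisation \(\Gamma (\frac {m+d}{2})/[(d\pi )^{\frac {m}{2}}\Gamma (\frac {d}{2})]\).

```python
import math
import numpy as np


def g_normal(x, m=1):
    """g-function of the m-dimensional normal distribution."""
    return (2.0 * math.pi) ** (-m / 2.0) * np.exp(-0.5 * x)


def g_cont_normal(x, m=1, eps=0.5, delta=5.0):
    """g-function of the contaminated normal (eps-mixture, delta-scaled variance)."""
    core = (1.0 - eps) * np.exp(-0.5 * x)
    wide = eps * delta ** (-m / 2.0) * np.exp(-0.5 * x / delta)
    return (core + wide) / (2.0 * math.pi) ** (m / 2.0)


def g_student(x, m=1, d=3.0):
    """g-function of the multivariate t-distribution with d degrees of freedom."""
    c = math.gamma((m + d) / 2.0) / ((d * math.pi) ** (m / 2.0) * math.gamma(d / 2.0))
    return c * (1.0 + x / d) ** (-(m + d) / 2.0)


def integral_1d(g, half_width=2000.0, step=0.01):
    """Riemann-sum approximation of the integral of g(x^2) over the real line (m = 1)."""
    x = np.arange(-half_width, half_width + step, step)
    return float(np.sum(g(x ** 2)) * step)


# Each g should integrate to one, as required of an EC density generator.
masses = {
    "normal": integral_1d(g_normal),
    "cont_normal": integral_1d(g_cont_normal),
    "student": integral_1d(g_student),
}
```

Evaluating the functions at a large argument, e.g. `g_student(25.0)` against `g_normal(25.0)`, gives a quick impression of how much heavier the tails of the two alternatives are.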

Note that, since (2) is symmetric with respect to a, the vector a in (2) is indeed the mean of \(\hat {a}\), \(\mathsf {E}(\hat {a})=a\). The positive-definite matrix \(\Sigma _{\hat {a}\hat {a}}\) in (2), however, is in general not the variance matrix of \(\hat {a}\). It can be shown that the variance matrix of \(\hat {a}\), which we will denote as \(Q_{\hat {a}\hat {a}}\), is a scaled version of \(\Sigma _{\hat {a}\hat {a}}\). For the above three distributions (with \(d>2\) for the Student distribution, so that the variance exists), the Q- and \(\Sigma \)-matrices are related as

$$\displaystyle \begin{aligned} \begin{array}{ll} {\mathrm{Normal:}}& Q_{\hat{a}\hat{a}}=\Sigma_{\hat{a}\hat{a}}\\ {\mathrm{Cont. normal:}}& Q_{\hat{a}\hat{a}}=(1-\epsilon+\epsilon \delta)\Sigma_{\hat{a}\hat{a}}\\ {\mathrm{Student\;distr.:}}& Q_{\hat{a}\hat{a}}=\frac{d}{d-2}\Sigma_{\hat{a}\hat{a}} \end{array} \end{aligned}$$
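The scale factors can be checked numerically in the univariate case. The sketch below is an illustration only: the function names and the integration grid are assumptions, and \(d=5\) is used for the Student distribution (instead of the \(d=3\) of the figures) purely because the variance integral then converges quickly on a finite grid.

```python
import math
import numpy as np


def pdf_cont_normal(x, sigma, eps=0.5, delta=5.0):
    """Univariate contaminated normal PDF with Sigma = sigma**2."""
    q = (x / sigma) ** 2
    core = (1.0 - eps) * np.exp(-0.5 * q)
    wide = eps * delta ** -0.5 * np.exp(-0.5 * q / delta)
    return (core + wide) / (math.sqrt(2.0 * math.pi) * sigma)


def pdf_student(x, sigma, d=5.0):
    """Univariate Student PDF with Sigma = sigma**2 and d degrees of freedom."""
    q = (x / sigma) ** 2
    c = math.gamma((1.0 + d) / 2.0) / (math.sqrt(d * math.pi) * math.gamma(d / 2.0))
    return c / sigma * (1.0 + q / d) ** (-(1.0 + d) / 2.0)


def numerical_variance(pdf, half_width=2000.0, step=0.01):
    """Riemann-sum approximation of the second moment about zero."""
    x = np.arange(-half_width, half_width + step, step)
    return float(np.sum(x ** 2 * pdf(x)) * step)


sigma = 0.1
var_cn = numerical_variance(lambda x: pdf_cont_normal(x, sigma))   # expect (1 - eps + eps*delta) * sigma^2
var_t = numerical_variance(lambda x: pdf_student(x, sigma, d=5.0))  # expect d / (d - 2) * sigma^2
```

With \(\epsilon =0.5\) and \(\delta =5\) the contaminated normal factor is \(1-\epsilon +\epsilon \delta =3\), which happens to equal the Student factor \(d/(d-2)\) for \(d=3\).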

Figure 1 shows the three univariate PDFs for the case they have the same \(\Sigma \) (left) and for the case they have the same Q (right). When the three distributions are compared with the same \(\Sigma \), the contaminated normal and Student distributions indeed have heavier tails than the normal and are also less peaked than the normal distribution. This situation changes, however, when the distributions are compared having the same variance. Although the contaminated normal and Student distributions then still have heavier tails than the normal distribution, this is less pronounced (see the zoom-ins), while the normal distribution is now the least peaked of the three. In practice one therefore has to exercise some caution when comparing these distributions, especially since one will often already have determined or know the precision of the observables, and will therefore work under the assumption that the three distributions have the same variance.

Fig. 1 The univariate PDFs of the normal (blue), contaminated normal (green; \(\epsilon =0.5\), \(\delta =5\)) and Student (red; \(d=3\)) distribution, together with their zoom-ins. Left: all PDFs have the same \(\sigma =0.1\) (thus different variances); Right: all PDFs have the same variance (thus different \(\sigma \)’s). The normal distribution is shown for \(\sigma =0.1\) (Left and Right)
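The change in peakedness between the two comparisons can be reproduced with a few lines of code. This is a sketch under the figure's stated settings (\(\sigma =0.1\), \(\epsilon =0.5\), \(\delta =5\), \(d=3\)); the function and variable names are illustrative assumptions, and the tail is sampled at the arbitrarily chosen point \(x=0.4\).

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def pdf_normal(x, s):
    return math.exp(-0.5 * (x / s) ** 2) / (SQRT_2PI * s)


def pdf_cont_normal(x, s, eps=0.5, delta=5.0):
    # Mixture of two zero-mean normals with variances s^2 and delta * s^2.
    return (1.0 - eps) * pdf_normal(x, s) + eps * pdf_normal(x, s * math.sqrt(delta))


def pdf_student(x, s, d=3.0):
    c = math.gamma((1.0 + d) / 2.0) / (math.sqrt(d * math.pi) * math.gamma(d / 2.0))
    return c / s * (1.0 + (x / s) ** 2 / d) ** (-(1.0 + d) / 2.0)


sigma = 0.1
eps, delta, d = 0.5, 5.0, 3.0

# Same Sigma: all three use the scale sigma directly.
same_sigma_peak = {
    "normal": pdf_normal(0.0, sigma),
    "cont_normal": pdf_cont_normal(0.0, sigma),
    "student": pdf_student(0.0, sigma),
}

# Same Q: shrink the scale so that every variance equals sigma**2.
s_cn = sigma / math.sqrt(1.0 - eps + eps * delta)
s_t = sigma / math.sqrt(d / (d - 2.0))
same_q_peak = {
    "normal": pdf_normal(0.0, sigma),
    "cont_normal": pdf_cont_normal(0.0, s_cn),
    "student": pdf_student(0.0, s_t),
}
same_q_tail = {
    "normal": pdf_normal(0.4, sigma),
    "cont_normal": pdf_cont_normal(0.4, s_cn),
    "student": pdf_student(0.4, s_t),
}
```

Under the same \(\Sigma \) the normal PDF has the highest peak, whereas under the same Q it has the lowest peak while still having the lightest tail at the sampled point.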

3 Distribution of the Ambiguity Residual

We will now provide the PDF of the ambiguity residual \(\check {\epsilon }\), assuming that the PDF of the GNSS data is a member of the class of elliptically contoured distributions. We have the following result.

Theorem

Let \(\hat {a} \sim {\mathrm{EC}}_{n}(a, \Sigma _{\hat {a}\hat {a}}, g)\) and \(\check {a}=\mathcal {I}(\hat {a})\). Then the PDF of \(\check {\epsilon }=\hat {a}-\check {a}\) is given as

$$\displaystyle \begin{aligned} {} f_{\check{\epsilon}}(x) = \sum_{z \in \mathbb{Z}^{n}} \frac{g(\Vert x-a-z\Vert_{\Sigma_{\hat{a}\hat{a}}}^{2})}{\sqrt{\vert \Sigma_{\hat{a}\hat{a}}\vert}} p_{0}(x) \end{aligned} $$
(4)

where \(p_{0}(x)\) is the indicator function of the origin-centred pull-in region of the integer-map \(\mathcal {I}(.)\) (Teunissen 2002).

In constructing \(f_{\check {\epsilon }}(x)\) from \(f_{\hat {a}}(x)\) we follow the distributional steps as graphically depicted in Fig. 2:

  1. From \(f_{\hat {a}}(x)\) we can determine the joint PDF \(f_{\hat {a}, \check {a}}(x,z)=f_{\hat {a}}(x)p_{z}(x)\), in which \(p_{z}(x)\) is the indicator function of the pull-in region of \(\mathcal {I}(.)\), centred at z.

  2. Note that summation over z of the joint PDF gives the PDF \(f_{\hat {a}}(x)\) and that integration over x gives the probability mass function (PMF) \(\mathsf {P}[\check {a}=z]\).

  3. Application of the PDF transformation rule to the one-to-one transformation between \(\hat {a}, \check {a}\) and \(\check {\epsilon }, \check {a}\) gives the joint PDF \(f_{\check {\epsilon }, \check {a}}(x, z)= f_{\hat {a}, \check {a}}(x+z,z)=f_{\hat {a}}(x+z)p_{z}(x+z)\).

  4. Summing this joint PDF over z, and using \(p_{z}(x+z)=p_{0}(x)\) (the pull-in regions of an admissible integer estimator are integer translates of one another), finally gives \(f_{\check {\epsilon }}(x)= \sum _{z \in \mathbb {Z}^{n}}f_{\hat {a}}(x+z)p_{0}(x)\) and therefore, with the use of (2), the result (4).

Fig. 2 From the float PDF \(f_{\hat {a}}(x)\), via the joint PDF \(f_{\hat {a}, \check {a}}(x, z)\), to the ambiguity-residual PDF \(f_{\check {\epsilon }}(x)\)
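For \(n=1\) with integer rounding, expression (4) can be evaluated by truncating the infinite integer sum. The following is a minimal sketch and not a reference implementation: the function names, the truncation bound `z_max` and the midpoint-rule check are assumptions made for illustration.

```python
import math
import numpy as np


def g_normal(q):
    """Univariate normal g-function."""
    return np.exp(-0.5 * q) / math.sqrt(2.0 * math.pi)


def g_student(q, d=3.0):
    """Univariate Student g-function with d degrees of freedom."""
    c = math.gamma((1.0 + d) / 2.0) / (math.sqrt(d * math.pi) * math.gamma(d / 2.0))
    return c * (1.0 + q / d) ** (-(1.0 + d) / 2.0)


def residual_pdf(x, g, sigma, a=0.0, z_max=200):
    """Evaluate (4) for n = 1: truncated sum over z, times the indicator of [-0.5, 0.5]."""
    x = np.asarray(x, dtype=float)
    z = np.arange(-z_max, z_max + 1)
    # Squared Sigma-weighted distance ||x - a - z||^2 with Sigma = sigma^2.
    q = ((x[:, None] - a - z[None, :]) / sigma) ** 2
    density = g(q).sum(axis=1) / sigma
    return np.where(np.abs(x) <= 0.5, density, 0.0)


# Midpoint rule on the pull-in interval: the mean of the samples approximates the integral.
n_cells = 2000
x_mid = (np.arange(n_cells) + 0.5) / n_cells - 0.5
mass_normal = float(residual_pdf(x_mid, g_normal, sigma=0.25).mean())
mass_student = float(residual_pdf(x_mid, g_student, sigma=0.25).mean())
```

Both masses should be close to one, which confirms that folding the float PDF onto the pull-in interval redistributes probability without losing any of it.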

Note that the domain of the PDF \(f_{\check {\epsilon }}(x)\) is that of the indicator function \(p_{0}(x)\) and thus depends on which integer ambiguity estimator is used for computing \(\check {a}\). Also note that we have not yet made the assumption in (4) that \(a \in \mathbb {Z}^{n}\). This is the reason why a is still present in the expression of (4); otherwise it would vanish because of the infinite integer sum. This allows us to consider the distribution also for non-integer values of the ambiguities. We will come back to this in Sect. 6. First, however, we consider, for the case \(a \in \mathbb {Z}^{n}\), the shape of the ambiguity-residual PDF for some different EC-distributions and some different integer ambiguity estimators.
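The vanishing of a for integer ambiguities can be made explicit with a short worked step, given here as an illustration. For \(a \in \mathbb {Z}^{n}\), the substitution \(u=a+z\) merely relabels the integer grid, so that

$$\displaystyle \begin{aligned} \sum_{z \in \mathbb{Z}^{n}} g(\Vert x-a-z\Vert_{\Sigma_{\hat{a}\hat{a}}}^{2}) = \sum_{u \in \mathbb{Z}^{n}} g(\Vert x-u\Vert_{\Sigma_{\hat{a}\hat{a}}}^{2}) \end{aligned} $$

For \(a \notin \mathbb {Z}^{n}\) the set \(\{a+z\}\) is a shifted grid rather than \(\mathbb {Z}^{n}\) itself, and a does not drop out.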

4 PDF \(f_{\check {\epsilon }}\left (x\right )\): One-Dimensional Case

For the one-dimensional case, integer rounding (IR) is the only admissible integer estimator, and its pull-in region is the origin-centred interval of length 1. Figure 3 shows the univariate PDF \(f_{\check {\epsilon }}(x)\) when the data are distributed as normal (blue), contaminated normal (green) and Student (red), for different values of \(\sigma \) (\(0.1\), \(0.25\), \(0.5\)). The PDFs at the top all have the same \(\Sigma \), while those at the bottom all have the same Q (variance).

Fig. 3 Univariate PDF of ambiguity residual \(\check {\epsilon }=\hat {a}-\check {a}\) when the data is distributed as normal (blue), contaminated normal (green, \(\epsilon =0.5\), \(\delta =5\)) and Student (red, \(d=3\)), for different values of sigma (\(0.1\), \(0.25\), \(0.5\)). Top: same \(\Sigma \) for all three distributions. Bottom: same Q for all three distributions

The following conclusions can be drawn:

  1. The difference between the two PDFs \(f_{\check {\epsilon }}(x)\) and \(f_{\hat {a}}(x)\) is small if \(\sigma \) is sufficiently small with respect to 1 (the length of the pull-in interval). This can be understood as follows: the smaller \(\sigma \) gets, the larger the probability of correct integer estimation (i.e. the ambiguity success-rate) and thus the less uncertain the outcome of the integer estimator \(\check {a}\) becomes. The uncertainty of \(\check {\epsilon }=\hat {a}-\check {a}\) will then resemble that of \(\hat {a}\).

  2. When \(\sigma \) gets larger, the PDF \(f_{\check {\epsilon }}(x)\) tends to the uniform distribution. This happens sooner in case \(f_{\hat {a}}(x)\) is less peaked (the case ‘same \(\Sigma \)’).

  3. Although in both cases, ‘same \(\Sigma \)’ and ‘same Q’, the tails of the contaminated normal and Student distribution are heavier than that of the normal distribution, this property is not propagated into the PDF of the ambiguity residual. For ‘same Q’, the PDF \(f_{\check {\epsilon }}(x)\) under the normal distribution now has the heaviest tail.
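The first two conclusions can be illustrated numerically for the normal case. The sketch below measures how far \(f_{\check {\epsilon }}(x)\) is from \(f_{\hat {a}}(x)\) and from the uniform density on the pull-in interval; the function names, the grid and the truncation bound `z_max` are assumptions made for illustration, and only the normal distribution with \(a=0\) is considered.

```python
import math
import numpy as np


def float_pdf(x, sigma):
    """Zero-mean univariate normal PDF of the float ambiguity."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def residual_pdf(x, sigma, z_max=20):
    """Ambiguity-residual PDF on [-0.5, 0.5] via the truncated integer sum."""
    z = np.arange(-z_max, z_max + 1)
    return float_pdf(x[:, None] - z[None, :], sigma).sum(axis=1)


x = np.linspace(-0.5, 0.5, 1001)
gap_to_float = {}
gap_to_uniform = {}
for sigma in (0.1, 0.25, 0.5):
    f_res = residual_pdf(x, sigma)
    # Largest pointwise difference with the float PDF and with the uniform density 1.
    gap_to_float[sigma] = float(np.max(np.abs(f_res - float_pdf(x, sigma))))
    gap_to_uniform[sigma] = float(np.max(np.abs(f_res - 1.0)))
```

For \(\sigma =0.1\) the residual PDF is practically indistinguishable from the float PDF, while for \(\sigma =0.5\) it deviates from the uniform density by only a few percent.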

5 PDF \(f_{\check {\epsilon }}\left (x\right )\): Two-Dimensional Case

In the multivariate case (\(n \geq 2\)) not only the type of EC-distribution that is assumed for the data, but also the choice of integer estimator affects the PDF of the ambiguity residual. To show this, we consider the PDF of the two-dimensional double-differenced ambiguity residual vector of a single-epoch, GNSS dual-frequency geometry-free model, thereby assuming that the data follow a normal distribution. Figure 4 shows by colors the function values of the PDFs of \(\hat {a}\) and \(\check {\epsilon }=\hat {a}-\check {a}\) for three different integer estimators (IR, IB, ILS), for the case the ambiguities are in double-differenced (DD) form (top row) and for the case the ambiguities are in Z-transformed or \(\Lambda \)-decorrelated form (bottom row). As the two-dimensional pull-in regions of IR, IB and ILS are a unit square, a parallelogram and a hexagon, respectively, these are also the domains of the corresponding \(f_{\check {\epsilon }}(x)\).

Fig. 4 PDFs of \(\hat {a}\) and \(\check {\epsilon }=\hat {a}-\check {a}\) for three different integer estimators (IR, IB, ILS), when ambiguities are in double-differenced form (top row) and \(\Lambda \)-decorrelated form (bottom row)
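To make the role of the integer estimator concrete, the sketch below implements the three maps for \(n=2\). It is an illustration under stated assumptions: the function names, the example matrix `Sigma` and the float vector `a_hat` are hypothetical, bootstrapping starts with the first ambiguity, and ILS is done by brute force over a small search window (`radius`) rather than by the efficient search used in practice.

```python
import itertools
import numpy as np


def integer_rounding(a_hat):
    """IR: round every component independently."""
    return np.rint(np.asarray(a_hat, dtype=float)).astype(int)


def integer_bootstrapping(a_hat, Sigma):
    """IB for n = 2: round the first ambiguity, then the conditioned second one."""
    a_hat = np.asarray(a_hat, dtype=float)
    z1 = np.rint(a_hat[0])
    # Second ambiguity conditioned on the first being fixed to z1.
    a2_cond = a_hat[1] - Sigma[1, 0] / Sigma[0, 0] * (a_hat[0] - z1)
    return np.array([z1, np.rint(a2_cond)]).astype(int)


def integer_least_squares(a_hat, Sigma, radius=3):
    """Brute-force ILS: minimise the Sigma-weighted distance over a search window."""
    a_hat = np.asarray(a_hat, dtype=float)
    W = np.linalg.inv(Sigma)
    centre = np.rint(a_hat).astype(int)
    best, best_cost = None, np.inf
    for offset in itertools.product(range(-radius, radius + 1), repeat=a_hat.size):
        z = centre + np.array(offset)
        r = a_hat - z
        cost = float(r @ W @ r)
        if cost < best_cost:
            best, best_cost = z, cost
    return best


# Hypothetical, strongly correlated example.
Sigma = np.array([[0.09, 0.08], [0.08, 0.09]])
a_hat = np.array([0.4, -0.3])
z_ir = integer_rounding(a_hat)
z_ib = integer_bootstrapping(a_hat, Sigma)
z_ils = integer_least_squares(a_hat, Sigma)
residuals = {"IR": a_hat - z_ir, "IB": a_hat - z_ib, "ILS": a_hat - z_ils}
```

For this correlated example the three estimators return three different integer vectors, and hence three different ambiguity residuals for the same float solution. With a diagonal `Sigma` they coincide.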

As the DD ambiguities are highly correlated, the contour-lines of \(f_{\hat {a}}(x)\) are very elongated (Fig. 4, top-left). The impact of this extreme elongation is reflected in the three PDFs of the ambiguity-residual vector (Fig. 4, top row). For IR and IB this results in multi-modality and ridges in their PDFs. This is not the case for ILS, as the shape of its pull-in region provides the best-possible approximation to the shape of the contour lines of \(f_{\hat {a}}(x)\) (Teunissen 1999).

As integer ambiguities are usually not resolved in DD-form, but rather in \(\Lambda \)-decorrelated form using the LAMBDA-method (Teunissen 1995), the corresponding PDFs are shown in the bottom row of Fig. 4. Compared to the DD-case (Fig. 4, top row), the three ambiguity-residual PDFs now resemble \(f_{\hat {a}}(x)\) over a larger part of their domain. The differences between \(f_{\check {\epsilon }}(x)\) and \(f_{\hat {a}}(x)\) are now more confined to the boundaries of the pull-in regions and also differ between the pull-in regions. These differences get smaller the more precise the ambiguities are.

6 The Case \(a \notin \mathbb {Z}^{n}\)

So far we assumed the ambiguities to be integer. As a result the PDF of the ambiguity residual \(\check {\epsilon }=\hat {a}-\check {a}\) was shown to be symmetric with respect to the origin. This situation changes drastically, however, when the ambiguities fail to be integer, \(a \notin \mathbb {Z}^{n}\). Note that when we change the value of a, the EC-PDF \(f_{\hat {a}}(x)\) simply translates over this change in a, without changing its shape. This is not the case, however, for the PDF of the ambiguity residual. This difference in behaviour of \(f_{\hat {a}}(x)\) and \(f_{\check {\epsilon }}(x)\) under changes of a is illustrated in Fig. 5. The lack of translational invariance in \(f_{\check {\epsilon }}(x)\) is due to the finite extent of its domain as dictated by the pull-in region. Due to this constraint, the shape of \(f_{\check {\epsilon }}(x)\) has to change when a changes by a noninteger amount. Its shape only remains the same when a changes by an integer amount.

Fig. 5 PDFs \(f_{\hat {a}}(x)\) (top row) and \(f_{\check {\epsilon }}(x)\) (bottom row) for different \(a \notin \mathbb {Z}\) (\(0.1\), \(0.5\)) and different \(\sigma \) (\(0.1\), \(0.25\)), when data is assumed to be distributed as normal (blue), contaminated normal (green, \(\epsilon =0.5\), \(\delta =5\)) or Student (red, \(d=3\))
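The lack of translational invariance can be checked with a small univariate experiment. This is a sketch for the normal case only; the function name, the grid and the truncation bound `z_max` are illustrative assumptions.

```python
import math
import numpy as np


def residual_pdf(x, a, sigma, z_max=20):
    """Univariate normal version of (4) on [-0.5, 0.5] for an arbitrary real a."""
    z = np.arange(-z_max, z_max + 1)
    u = (x[:, None] - a - z[None, :]) / sigma
    return (np.exp(-0.5 * u ** 2) / (math.sqrt(2.0 * math.pi) * sigma)).sum(axis=1)


x = np.linspace(-0.5, 0.5, 201)
f_a0 = residual_pdf(x, a=0.0, sigma=0.25)    # integer ambiguity
f_a1 = residual_pdf(x, a=1.0, sigma=0.25)    # shifted by an integer
f_a01 = residual_pdf(x, a=0.1, sigma=0.25)   # non-integer ambiguity

same_after_integer_shift = bool(np.allclose(f_a0, f_a1))
symmetric_for_integer_a = bool(np.allclose(f_a0, f_a0[::-1]))
symmetric_for_fractional_a = bool(np.allclose(f_a01, f_a01[::-1]))
peak_location = float(x[np.argmax(f_a01)])
```

An integer change in a leaves the residual PDF unchanged, whereas the fractional value \(a=0.1\) moves the peak away from the origin and destroys the symmetry within the fixed pull-in interval.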

7 Summary and Conclusion

In this contribution we provided the PDF \(f_{\check {\epsilon }}(x)\) of the ambiguity-residuals for the case the distribution of the GNSS data is elliptically contoured. The normal, the contaminated normal and the Student distribution were hereby taken as examples. We then evaluated several characteristics of \(f_{\check {\epsilon }}(x)\) in its dependence on both the shape of the elliptically contoured data distributions (‘same \(\Sigma \)’ vs ‘same Q’) as well as the chosen integer ambiguity estimator (IR, IB or ILS). Finally we highlighted the lack of translational invariance of \(f_{\check {\epsilon }}(x)\), which is a property that really discriminates it from the PDF \(f_{\hat {a}}(x)\) of the float ambiguities.

In many empirical GNSS studies, the ambiguity residuals are still evaluated by comparing their histograms with the PDF \(f_{\hat {a}}(x)\). This is incorrect, since, as shown above, the two PDFs \(f_{\hat {a}}(x)\) and \(f_{\check {\epsilon }}(x)\) can have very different characteristics. Moreover, there is no need to use \(f_{\hat {a}}(x)\) for comparative purposes, since the exact analytical expression for the PDF of the ambiguity residuals is available (cf. (4)).