# Distributional theory for the DIA method


## Abstract

The DIA method for the detection, identification and adaptation of model misspecifications combines estimation with testing. The aim of the present contribution is to introduce a unifying framework for the rigorous capture of this combination. By using a canonical model formulation and a partitioning of misclosure space, we show that the whole estimation–testing scheme can be captured in one single DIA estimator. We study the characteristics of this estimator and discuss some of its distributional properties. With the distribution of the DIA estimator provided, one can then study all the characteristics of the combined estimation and testing scheme, as well as analyse how they propagate into final outcomes. Examples are given, as well as a discussion on how the distributional properties compare with their usage in practice.

## Keywords

Detection, Identification and Adaptation (DIA) · Tienstra transformation · Baarda test statistic · Misclosure partitioning · Voronoi-partitioning unit sphere · DIA estimator · Best linear unbiased estimation (BLUE) · Best linear unbiased prediction (BLUP) · Hazardous probability · Bias · Missed detection (MD) · Correct detection (CD) · Correct identification (CI)

## 1 Introduction

The DIA method for the detection, identification and adaptation of model misspecifications, together with its associated internal and external reliability measures, finds its origin in the pioneering work of Baarda (1967, 1968a), see also e.g. Alberda (1976), Kok (1984), Teunissen (1985). For its generalization to recursive quality control in dynamic systems, see Teunissen and Salzmann (1989), Teunissen (1990). The DIA method has found its use in a wide range of applications, for example, for the quality control of geodetic networks (DGCC 1982), for geophysical and structural deformation analyses (Van Mierlo 1980; Kok 1982), for different GPS applications (Van der Marel and Kosters 1990; Teunissen 1998b; Tiberius 1998; Hewitson et al. 2004; Perfetti 2006; Drevelle and Bonnifait 2011; Fan et al. 2011) and for various configurations of integrated navigation systems (Teunissen 1989; Salzmann 1993; Gillissen and Elema 1996).

The DIA method combines estimation with testing. Parameter estimation is conducted to find estimates of the parameters one is interested in, and testing is conducted to validate these results with the aim of removing any biases that may be present. The consequence of this practice is that the method is not one of estimation only, nor one of testing only, but one where estimation and testing are combined.

The aim of the present contribution is to introduce a unifying framework that captures the combined estimation and testing scheme of the DIA method. This implies that one has to take the intricacies of this combination into consideration when evaluating the contributions of the various decisions and estimators involved. By using a canonical model formulation and a partitioning of misclosure space, we show that the whole estimation–testing scheme can be captured in one single estimator \(\bar{x}\). We study the characteristics of this estimator and discuss some of its distributional properties. With the distribution of \(\bar{x}\) provided, one can then study all the characteristics of the combined estimation and testing scheme, as well as analyse how they propagate into the final outcome \(\bar{x}\).

This contribution is organized as follows. After a description of the null and alternative hypotheses considered, we derive the DIA estimator \(\bar{x}\) in Sect. 2. We discuss its structure and identify the contributions from testing and estimation, respectively. We also discuss some of its variants, namely when adaptation is combined with a remeasurement of rejected data or when adaptation is only carried out for a subset of misclosure space. In Sect. 3, we derive the distribution of the estimator \(\bar{x}\). As one of its characteristics, we prove that the estimator is unbiased under \(\mathcal {H}_{0}\), but not under any of the alternative hypotheses. We not only prove this for estimation, but also for methods of prediction, such as collocation, Kriging and the BLUP. Thus although testing has the intention of removing biases from the solution, we prove that this is not strictly achieved. We show how the bias of \(\bar{x}\) can be evaluated and on what contributing factors it depends.

In Sect. 4, we decompose the distribution conditionally over the events of missed detection, correct identification and wrong identification, thus providing insight into the conditional biases as well. We also discuss in this context the well-known concept of the minimal detectable bias (MDB). In order to avoid a potential pitfall, we highlight here that the MDB is about detection and not about identification. By using the same probability of correct detection for all alternative hypotheses \(\mathcal {H}_{a}\), the MDBs can be compared and provide information on the sensitivity of rejecting the null hypothesis for \(\mathcal {H}_{a}\)-biases the size of their MDBs. The MDBs are therefore about correct detection and not about correct identification. The two would coincide only in the binary case, when next to the null hypothesis a single alternative hypothesis is considered. Because of this difference between the binary and the multiple-hypothesis case, we also discuss the probability of correct identification and associated minimal bias sizes.

In Sect. 5, we discuss ways of evaluating the DIA estimator. We make a distinction between unconditional and conditional evaluations and show how they relate to the procedures followed in practice. We point out that, although the procedure followed in practice is usually a conditional one, the conditional distribution itself is not strictly used in practice. In any follow-on processing for which the outcome of \(\bar{x}\) is used as input, the distribution of the estimator under the identified hypothesis is used without regard to the conditioning process that led to the identified hypothesis. We discuss this difference and show how it can be evaluated. Finally, a summary with conclusions is provided in Sect. 6. We emphasize that our development will be nonBayesian throughout. Hence, the only random vectors considered are the vector of observables *y* and functions thereof, while the unknown to-be-estimated parameter vector *x* and the unknown bias vectors \(b_{i}\) are assumed to be deterministic.

## 2 DIA method and principles

### 2.1 Null and alternative hypotheses

Consider the linear(ized) null hypothesis

\(\mathcal {H}_{0}: \quad \mathsf {E}(y)=Ax, \quad \mathsf {D}(y)=Q_{yy} \qquad (1)\)

with \(\mathsf {E}(.)\) the expectation operator, \(y \in \mathbb {R}^{m}\) the normally distributed random vector of observables, \(A \in \mathbb {R}^{m \times n}\) the given design matrix of rank *n*, \(x \in \mathbb {R}^{n}\) the to-be-estimated unknown parameter vector, \(\mathsf {D}(.)\) the dispersion operator and \(Q_{yy} \in \mathbb {R}^{m \times m}\) the given positive-definite variance matrix of *y*. The *redundancy* of \(\mathcal {H}_{0}\) is \(r=m-\mathrm{rank}(A)=m-n\).

Under \(\mathcal {H}_{0}\), the best linear unbiased estimator (BLUE) of *x* is given as

\(\hat{x}_{0}=A^{+}y, \quad A^{+}=(A^{T}Q_{yy}^{-1}A)^{-1}A^{T}Q_{yy}^{-1} \qquad (2)\)

in which \(A^{+}\) denotes the BLUE-inverse of *A*. As the quality of \(\hat{x}_{0}\) depends on the validity of the null hypothesis, it is important that one has sufficient confidence in \(\mathcal {H}_{0}\). Although every part of the null hypothesis can be wrong of course, we assume here that if a misspecification occurred it is confined to an underparametrization of the mean of *y*, in which case the alternative hypothesis is of the form

\(\mathcal {H}_{i}: \quad \mathsf {E}(y)=Ax+C_{i}b_{i}, \quad \mathsf {D}(y)=Q_{yy} \qquad (3)\)

for some nonzero bias vector \(b_{i}\) and matrix \(C_{i}\), with \([A, C_{i}]\) of full rank. The BLUE of *x* under \(\mathcal {H}_{i}\) becomes

\(\hat{x}_{i}=\bar{A}_{(i)}^{+}y, \quad \bar{A}_{(i)}=P_{C_{i}}^{\perp }A \qquad (4)\)

As it is usually not one single mismodelling error \(C_{i}b_{i}\) one is concerned about, but quite often many more than one, a testing procedure needs to be devised for handling the various, say *k*, alternative hypotheses \(\mathcal {H}_{i}\), \(i=1, \ldots , k\). Such a procedure then usually consists of the following three steps of detection, identification and adaptation (DIA) (Baarda 1968a; Teunissen 1990; Imparato 2016):

*Detection*: An overall model test on \(\mathcal {H}_{0}\) is performed to diagnose whether an unspecified model error has occurred. It provides information on whether one can have sufficient confidence in the assumed null hypothesis, without the explicit need to specify and test any particular alternative hypothesis. Once confidence in \(\mathcal {H}_{0}\) has been declared, \(\hat{x}_{0}\) is provided as the estimate of *x*.

*Identification*: In case confidence in the null hypothesis is lacking, identification of the potential source of model error is carried out. It implies the execution of a search among the specified alternatives \(\mathcal {H}_{i}\), \(i=1, \ldots , k\), for the most likely model misspecification.

*Adaptation*: After identification of the suspected model error, a corrective action is undertaken on the \(\mathcal {H}_{0}\)-based inferences. With the null hypothesis rejected, the identified hypothesis, \(\mathcal {H}_{i}\) say, becomes the new null hypothesis and \(\hat{x}_{i}\) is provided as the estimate of *x*.

As the above steps illustrate, the outcome of testing determines how the parameter vector *x* will be estimated. Thus although estimation and testing are often treated separately and independently, in actual practice when testing is involved the two are intimately connected. This implies that one has to take the intricacies of this combination into consideration when evaluating the properties of the estimators involved. In order to help facilitate such rigorous propagation of the uncertainties, we first formulate our two working principles.

### 2.2 DIA principles

- 1.
Canonical form: as validating inferences should remain invariant for one-to-one model transformations, use will be made of a canonical version of \(\mathcal {H}_{0}\), thereby simplifying some of the derivations.

- 2.
Partitioning: to have an unambiguous testing procedure, the \(k+1\) hypotheses \(\mathcal {H}_{i}\) are assumed to induce an unambiguous partitioning of the observation space.

#### 2.2.1 Canonical model

To bring the model into canonical form, use is made of the *Tienstra*-transformation (Tienstra 1956)

\(\mathcal {T}=[A^{+T}, B]^{T} \qquad (5)\)

in which *B* is an \(m \times r\) basis matrix of the null space of \(A^{T}\), i.e. \(B^{T}A=0\) and \(\mathrm{rank}(B)=r\). The Tienstra transformation is a one-to-one transformation, having the inverse \(\mathcal {T}^{-1}=[A, B^{+T}]\), with \(B^{+}=(B^{T}Q_{yy}B)^{-1}B^{T}Q_{yy}\). We have used the \(\mathcal {T}\)-transformation to canonical form also for LS-VCE (Teunissen and Amiri-Simkooei 2008) and for the recursive BLUE-BLUP (Teunissen and Khodabandeh 2013). Application of \(\mathcal {T}\) to *y* gives under the null hypothesis (1),

\(\mathcal {T}y=[\hat{x}_{0}^{T}, t^{T}]^{T}, \quad \mathsf {E}(\hat{x}_{0}|\mathcal {H}_{0})=x, \quad \mathsf {E}(t|\mathcal {H}_{0})=0 \qquad (6)\)

in which \(\hat{x}_{0}=A^{+}y\) is the BLUE of *x* under \(\mathcal {H}_{0}\) and \(t=B^{T}y \in \mathbb {R}^{r}\) is the vector of misclosures (the usage of the letter *t* for misclosure follows from the Dutch word '*t*egenspraak'). Their variance matrices are given as

\(Q_{\hat{x}_{0}\hat{x}_{0}}=(A^{T}Q_{yy}^{-1}A)^{-1}, \quad Q_{tt}=B^{T}Q_{yy}B \qquad (7)\)

As *t* is zero mean and stochastically independent of \(\hat{x}_{0}\), it contains all the available information useful for testing the validity of \(\mathcal {H}_{0}\). Note, since a basis matrix is not uniquely defined, that also the vector of misclosures is not uniquely defined. This is, however, of no consequence, as the testing will only make use of the intrinsic information that is contained in the misclosures and hence will be invariant for any one-to-one transformation of *t*.
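The canonical decomposition above lends itself to a quick numerical sketch. The following is a minimal illustration under assumed settings (a toy line-fit model with made-up numbers; `scipy.linalg.null_space` supplies an orthonormal basis matrix *B* of the null space of \(A^{T}\)):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)

# Toy model: m = 5 observations, n = 2 parameters, redundancy r = 3.
m, n = 5, 2
A = np.column_stack([np.ones(m), np.arange(m, dtype=float)])
Qyy = np.diag(rng.uniform(0.5, 2.0, size=m))         # positive-definite variance matrix
Qinv = np.linalg.inv(Qyy)

B = null_space(A.T)                                  # basis matrix: B^T A = 0, rank(B) = r
Aplus = np.linalg.solve(A.T @ Qinv @ A, A.T @ Qinv)  # BLUE-inverse of A

y = A @ np.array([1.0, 0.5]) + rng.normal(scale=np.sqrt(np.diag(Qyy)))
x0_hat = Aplus @ y                                   # BLUE of x under H0
t = B.T @ y                                          # misclosure vector, E(t|H0) = 0

# x0_hat and t are uncorrelated: Cov(x0_hat, t) = A^+ Qyy B
# = (A^T Qinv A)^{-1} A^T B = 0, because A^T B = 0.
```

Any other basis matrix *B* would give a one-to-one transformed misclosure vector, leaving the testing invariant.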

Under \(\mathcal {H}_{i}\) (cf. 3), \(\hat{x}_{0}\) and *t* are still independent, but now have different means than under \(\mathcal {H}_{0}\),

\(\hat{x}_{0} \sim \mathcal {N}(x+A^{+}C_{i}b_{i},\, Q_{\hat{x}_{0}\hat{x}_{0}}), \quad t \sim \mathcal {N}(B^{T}C_{i}b_{i},\, Q_{tt}) \qquad (8)\)

Note that \(C_{i}b_{i}\) gets propagated differently into the means of \(\hat{x}_{0}\) and *t*. We call

\(b_{\hat{x}_{0}}=A^{+}b_{y}, \quad b_{t}=B^{T}b_{y}, \quad b_{y}=C_{i}b_{i} \qquad (9)\)

the components of the observation bias \(b_{y}\). Since *t* has zero expectation under \(\mathcal {H}_{0}\) (cf. 6), it is the component \(b_{t}\) of \({b}_{y}\) that is testable. The component \({b}_{\hat{x}_{0}}\) on the other hand cannot be tested. As it will be directly absorbed by the parameter vector, it is this component of the observation bias \({b}_{{y}}\) that directly influences the parameter solution \(\hat{x}_{0}\). The *bias-to-noise ratios* (BNRs) of (9), defined by means of the weighted norm

\(||\cdot ||_{Q}^{2}=(\cdot )^{T}Q^{-1}(\cdot ) \qquad (10)\)

as

\(\lambda _{t}=||b_{t}||_{Q_{tt}}, \quad \lambda _{\hat{x}_{0}}=||b_{\hat{x}_{0}}||_{Q_{\hat{x}_{0}\hat{x}_{0}}} \qquad (11)\)

are dimensionless measures of bias significance (next to the bias itself, they depend on the strength of the underlying model, i.e. on *A* and \({Q}_{yy}\)). The BNRs (11) were introduced by Baarda in his reliability theory to determine measures of *internal* and *external* reliability, respectively (Baarda 1967, 1968b, 1976; Teunissen 2000). We discuss this further in Sect. 5.2.

With the canonical model at hand, we can now determine the BLUEs of *x* and \(b_{i}\) under \(\mathcal {H}_{i}\). The estimator \(\hat{x}_{0}\) will not contribute to the determination of the BLUE of \(b_{i}\), as \(\hat{x}_{0}\) and *t* are independent and the mean of \(\hat{x}_{0}\) now depends on more parameters than only those of *x*. Thus, it is *t* that is solely reserved for the determination of the BLUE of \(b_{i}\), which in its turn can be used in the determination of the BLUE of *x* under \(\mathcal {H}_{i}\). The BLUEs of *x* and \(b_{i}\) under \(\mathcal {H}_{i}\) are therefore given as

\(\hat{b}_{i}=(C_{t_{i}}^{T}Q_{tt}^{-1}C_{t_{i}})^{-1}C_{t_{i}}^{T}Q_{tt}^{-1}t, \quad C_{t_{i}}=B^{T}C_{i} \qquad (12)\)

and, expressing the BLUE of *x* under \(\mathcal {H}_{i}\) in that under \(\mathcal {H}_{0}\),

\(\hat{x}_{i}=\hat{x}_{0}-L_{i}t, \quad L_{i}=A^{+}C_{i}(C_{t_{i}}^{T}Q_{tt}^{-1}C_{t_{i}})^{-1}C_{t_{i}}^{T}Q_{tt}^{-1} \qquad (13)\)

#### 2.2.2 Partitioning of misclosure space

The one-to-one transformation (13) clearly shows how the vector of misclosures *t* plays its role in linking the BLUEs of the different hypotheses. This relation does, however, not yet incorporate the outcome of testing. To do so, we now apply our partitioning principle to the space of misclosures \(\mathbb {R}^{r}\) and unambiguously assign outcomes of *t* to the estimators \(\hat{x}_{i}\). Therefore, let \(\mathcal {P}_{i} \subset \mathbb {R}^{r}\), \(i=0, 1, \ldots , k\), be a *partitioning* of the *r*-dimensional misclosure space, i.e. \(\cup _{i=0}^{k} \mathcal {P}_{i} = \mathbb {R}^{r}\) and \(\mathcal {P}_{i} \cap \mathcal {P}_{j} = \emptyset \) for \(i \ne j\). Then, the unambiguous relation between *t* and \(\hat{x}_{i}\) is established by defining the testing procedure such that \(\mathcal {H}_{i}\) is selected if and only if \(t \in \mathcal {P}_{i}\). An alternative way of seeing this is as follows. Let the unambiguous testing procedure be captured by the mapping \(H: \mathbb {R}^{r} \mapsto \{0, 1, \ldots , k\}\); then the regions \(\mathcal {P}_{i}= \{ t \in \mathbb {R}^{r} \mid i=H(t) \}\), \(i=0, \ldots , k\), form a partition of misclosure space.

As the testing procedure is defined by the partitioning \(\mathcal {P}_{i} \subset \mathbb {R}^{r}\), \(i=0, \ldots , k\), any change in the partitioning will change the outcome of testing and thus the quality of the testing procedure. The choice of partitioning depends on different aspects, such as the null and alternative hypotheses considered and the required detection and identification probabilities. In the next sections, we develop our distributional results such that it holds for any chosen partitioning of the misclosure space. However, to better illustrate the various concepts involved, we first discuss a few partitioning examples.

### Example 1

\(\square \)

### Example 2

Let the extended design matrices \([A, c_{i}]\) of the *k* hypotheses \(\mathcal {H}_{i}\) (cf. 3) be of order \(m \times (n+1)\), denote \(C_{i}=c_{i}\) and \(B^{T}c_{i}=c_{t_{i}}\), and write Baarda's test statistic (Baarda 1967, 1968b; Teunissen 2000) as

\(w_{i}=\frac{c_{t_{i}}^{T}Q_{tt}^{-1}t}{\sqrt{c_{t_{i}}^{T}Q_{tt}^{-1}c_{t_{i}}}}\)

The partitioning can then be constructed from Voronoi regions \(V_{i}\) on the unit sphere, with the *geodesic distance* (great arc circle) between the unit vectors *u* and *v* (unicity is here defined in the standard Euclidean metric). If we now define the unit vectors

\(\bar{c}_{i}=Q_{tt}^{-1/2}c_{t_{i}}/||c_{t_{i}}||_{Q_{tt}}, \quad \bar{t}=Q_{tt}^{-1/2}t/||t||_{Q_{tt}} \qquad (21)\)

and consider the distributions of *t* and \(\bar{t}\) under \(\mathcal {H}_{a}\), we have that \(\bar{t}\) is distributed according to a *projected normal* distribution, which is unimodal and rotationally symmetric with respect to its mean direction \(\mu _{\bar{t}}/||\mu _{\bar{t}}||_{I_{r}}\). The scalar \(||\mu _{\bar{t}}||_{I_{r}}=\lambda _{t}\) is a measure of the peakedness of the PDF. Thus, the larger the testable BNR \(\lambda _{t}\) is, the more peaked the PDF of \(\bar{t}\) becomes. The density of \(\bar{t}\) is given in, e.g. Watson (1983). Under \(\mathcal {H}_{0}\), when \(\mu _{\bar{t}}=0\), the PDF of \(\bar{t}\) reduces to the *uniform* distribution on the unit sphere. The selection probabilities under \(\mathcal {H}_{0}\) are therefore given as \(\mathsf {P}( t \in \mathcal {P}_{i}| \mathcal {H}_{0}) = |V_{i}|\frac{\Gamma (\frac{r}{2})}{2 \pi ^{\frac{r}{2}}}\), in which \(|V_{i}|\) denotes the surface area covered by \(V_{i}\). \(\square \)

An important practical application of one-dimensional identification is *datasnooping*, i.e. the procedure in which the individual observations are screened for possible outliers (Baarda 1968a; DGCC 1982; Kok 1984). If we restrict ourselves to the case of one outlier at a time, then the \(c_{i}\)-vector of the alternative hypothesis takes the form of a canonical unit vector having 1 as its *i*th entry and zeros elsewhere. This then leads to \(k=m\) regions \(\mathcal {P}_{i}\), provided no two of the vectors \(c_{t_{i}}=B^{T}c_{i}\) are the same. If the latter happens, only one of them should be retained, as no difference between such hypotheses can then be made. This would, for instance, be the case when datasnooping is applied to levelled height differences of a single levelling loop.
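A minimal datasnooping sketch along these lines (a hypothetical line-fit model with one simulated outlier and, for a deterministic illustration, no measurement noise; the \(w_{i}\) are Baarda's test statistics computed from the misclosure vector):

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical line-fit model; observation 3 carries a simulated outlier.
m = 6
A = np.column_stack([np.ones(m), np.arange(m, dtype=float)])
Qyy = np.eye(m)
B = null_space(A.T)                        # c_ti = B^T c_i is the i-th column of B^T
Qtt_inv = np.linalg.inv(B.T @ Qyy @ B)

a_true, bias = 3, 10.0
y = A @ np.array([1.0, 0.5]) + bias * np.eye(m)[:, a_true]   # noise-free for illustration
t = B.T @ y

# Baarda's w-test statistic for each single-outlier hypothesis.
w = np.array([(B.T[:, i] @ Qtt_inv @ t) / np.sqrt(B.T[:, i] @ Qtt_inv @ B.T[:, i])
              for i in range(m)])
identified = int(np.argmax(np.abs(w)))     # most likely single-outlier hypothesis
```

By the Cauchy–Schwarz inequality, \(|w_{a}|\) is maximal for the true hypothesis in this noise-free setting, provided no two \(c_{t_{i}}\)-vectors are parallel.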

### Example 3

Consider the GNSS pseudorange model of a single receiver tracking *m* satellites. First we assume that the receiver position is known, thus reducing the design matrix to \(A=[1, \ldots , 1]^{T}\). Because of the symmetry in this model, one can expect the unit vectors \(\bar{c}_{i}\) (cf. 21), \(i=1, \ldots , m\), to have a symmetric distribution over the unit sphere and thus the datasnooping partitioning to be symmetric as well. Indeed, for the correlation between the \(w_{i}\)-statistics and therefore for the angle between the \(\bar{c}_{i}\) vectors, we have

### Example 4

Let the extended design matrices \([A, c_{i}]\) of the *k* hypotheses \(\mathcal {H}_{i}\) (cf. 3) be of order \(m \times (n+1)\) and now consider detection and identification. Then

In the first case, \(\hat{x}_{0}\) is provided as the estimate of *x*, while in the second case, \(\hat{x}_{i}=\hat{x}_{0}-L_{i}t\) (cf. 13) is provided for the identified alternative. The false-alarm selection probabilities under \(\mathcal {H}_{0}\) are now given as \(\mathsf {P}( t \in \mathcal {P}_{i \ne 0}| \mathcal {H}_{0}) = \alpha |V_{i}| \frac{\Gamma (\frac{r}{2})}{2 \pi ^{\frac{r}{2}}}\), in which \(\alpha \) is the overall level of significance. Note, the more correlated two *w*-test statistics \(w_{i}\) and \(w_{j}\) are, the smaller the angle \(\angle (c_{t_{i}}, c_{t_{j}})\) (see Fig. 3) and the more difficult it will be to discern between the two hypotheses \(\mathcal {H}_{i}\) and \(\mathcal {H}_{j}\), especially for small biases, see e.g. Foerstner (1983), Tiberius (1998), Yang et al. (2013b). \(\square \)

The testing procedure described above is formulated by means of the misclosure vector *t*. It can, however, also be formulated by means of the least-squares residual vector \(\hat{e}_{0}=y-A\hat{x}_{0}\), thus providing a perhaps more recognizable form of testing. As \(t \in \mathbb {R}^{r}\) and \(\hat{e}_{0} \in \mathbb {R}^{m}\) are related as \(t=B^{T}\hat{e}_{0}\), we have (Teunissen 2000)

\(||t||_{Q_{tt}}^{2}=||\hat{e}_{0}||_{Q_{yy}}^{2}, \quad w_{i}=\frac{c_{i}^{T}Q_{yy}^{-1}\hat{e}_{0}}{\sqrt{c_{i}^{T}Q_{yy}^{-1}Q_{\hat{e}_{0}\hat{e}_{0}}Q_{yy}^{-1}c_{i}}}\)

Hence, checking in which region *t* lies can be done using the least-squares residual vector \(\hat{e}_{0}\) obtained under \(\mathcal {H}_{0}\), without the explicit need of having to compute *t*.

Note that in case of uncorrelated observations, i.e. \(Q_{yy}=\mathrm{diag}(\sigma _{1}^{2}, \ldots , \sigma _{m}^{2})\), the adapted design matrix \(\bar{A}_{(i)}=P_{c_{i}}^{\perp }A\) for computing the BLUE under \(\mathcal {H}_{i}\) is the original design matrix with its *i*th row replaced by zeros. Hence, in case of datasnooping, the BLUE \(\hat{x}_{i} = \bar{A}_{(i)}^{+}y=\hat{x}_{0}-A^{+}c_{i}\hat{b}_{i}\) is then the estimator with the *i*th observable *excluded*. Such is, for instance, done in exclusion-based RAIM (Kelly 1998; Yang et al. 2013a).
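For uncorrelated observations, this exclusion equivalence is easy to verify numerically; a sketch with made-up numbers (not the paper's data):

```python
import numpy as np
from scipy.linalg import null_space

m, i = 6, 3                                  # datasnooping hypothesis H_i: outlier in obs i
A = np.column_stack([np.ones(m), np.arange(m, dtype=float)])
Qyy = np.diag([1.0, 0.8, 1.2, 0.9, 1.1, 1.0])
Qinv = np.linalg.inv(Qyy)
y = np.array([1.1, 1.4, 2.2, 5.0, 3.1, 3.4])

B = null_space(A.T)
Aplus = np.linalg.solve(A.T @ Qinv @ A, A.T @ Qinv)
t = B.T @ y
Qtt_inv = np.linalg.inv(B.T @ Qyy @ B)

c_i = np.eye(m)[:, i]                        # canonical unit vector of H_i
c_ti = B.T @ c_i
b_hat = (c_ti @ Qtt_inv @ t) / (c_ti @ Qtt_inv @ c_ti)   # BLUE of the outlier size
x_i = Aplus @ y - (Aplus @ c_i) * b_hat      # adapted solution: x0_hat - A^+ c_i b_hat

# Same result from the BLUE with the i-th observation simply excluded:
keep = np.arange(m) != i
Qk_inv = np.linalg.inv(Qyy[np.ix_(keep, keep)])
x_excl = np.linalg.solve(A[keep].T @ Qk_inv @ A[keep], A[keep].T @ Qk_inv @ y[keep])
```

The two solutions agree to machine precision, reflecting that a free bias parameter on observation *i* removes all of its information on *x*.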

Optimality results exist for testing against a *single* alternative (Arnold 1981; Teunissen 2000; Kargoll 2007), but not necessarily when multiple alternatives are in play. To be able to infer the quality of the various possible partitionings, one should therefore be able to diagnose their impact on the actual output of the DIA method.

### 2.3 The DIA estimator

The outcome of testing determines how *x* gets estimated. As the outcome of such estimation is influenced by the testing procedure, one cannot simply assign the properties of \(\hat{x}_{0}\) or \(\hat{x}_{i}\) to the actual DIA estimator computed. That is, the actual estimator that is produced is not \(\hat{x}_{0}\) nor \(\hat{x}_{i}\), but in fact (see Fig. 4) \(\bar{x}=\hat{x}_{i}\) if and only if \(t \in \mathcal {P}_{i}\). With the *indicator* functions \(p_{i}(t)\) of the regions \(\mathcal {P}_{i}\) (i.e. \(p_{i}(t)=1\) for \(t \in \mathcal {P}_{i}\) and \(p_{i}(t)=0\) elsewhere), the DIA estimator \(\bar{x}\) can be written in the compact form

\(\bar{x}=\sum _{i=0}^{k}\hat{x}_{i}\,p_{i}(t) \qquad (32)\)

which shows how both the outcomes of estimation and the outcome of testing through *t* contribute to the estimator \(\bar{x}\). If we now make use of the transformation (13), we can obtain its counterpart for \(\bar{x}\) as

\(\bar{x}=\hat{x}_{0}-\bar{L}(t)t, \quad \bar{L}(t)=\sum _{i=1}^{k}L_{i}\,p_{i}(t) \qquad (33)\)

which expresses the DIA estimator directly in the BLUE and the misclosure vector of \(\mathcal {H}_{0}\), and thereby in *y*. Although the structure of (33) resembles that of the linear transformation (13), note that (33) is a *nonlinear* transformation due to the presence of *t* in \(\bar{L}(t)\). Hence, this implies that the DIA estimator \(\bar{x}\) will *not* have a normal distribution, even if all the individual estimators \(\hat{x}_{i}\), \(i=0, \ldots , k\), are normally distributed. In the next section, we derive the probability density function (PDF) of \(\bar{x}\). But before we continue, we first briefly describe two variations on the above-defined DIA procedure.

*Remeasurement included* In case of datasnooping, with \(Q_{yy}=\mathrm{diag}(\sigma _{1}^{2}, \ldots , \sigma _{m}^{2})\), one may take the standpoint in certain applications that rejected observations should be remeasured. After all, one may argue, the measurement setup was designed with the rejected observation included. If one remeasures and replaces the rejected observation, say \(y_{i}\), with the remeasured one, say \(\bar{y}_{i}\), one is actually not producing \(\hat{x}_{i}\) as output, but instead the solution for *x* based on the following extension of the model under \(\mathcal {H}_{i}\),

with the *i*th row of *A* linked to \(\bar{y}_{i}\) and \(\sigma _{i}^{2}\) the variance of the rejected and remeasured observations. The BLUE of *x* under this model is

*Undecided included* Above, each hypothesis \(\mathcal {H}_{i}\) was given its own region \(\mathcal {P}_{i}\), such that together these regions cover the whole misclosure space, \(\cup _{i=0}^{k} \mathcal {P}_{i}=\mathbb {R}^{r}\). This implies, whatever the outcome of *t*, that one always will be producing one of the estimates \(\hat{x}_{i}\), \(i=0, \ldots , k\), even, for instance, if it would be hard to discriminate between some of the hypotheses or when selection is unconvincing. To accommodate such situations, one can generalize the procedure and introduce an undecided region \(\Omega \subset \mathbb {R}^{r}\) for which no estimator of *x* is produced at all when \(t \in \Omega \). Thus when that happens, the decision is made that a solution for *x* is unavailable,

This is comparable to the undecided region of *integer aperture estimation* (Teunissen 2003a). As a consequence of (36), the regions \(\mathcal {P}_{i}\) of each of the hypotheses have become smaller and now only partition a part of the misclosure space,

One may, for instance, only proceed with the identified hypothesis if its *w*-test statistic achieves a sufficient reduction in \(||t||_{Q_{tt}}^{2}\), i.e. if \(||P_{c_{t_{i}}}^{\perp }t||_{Q_{tt}}^{2}=||t||_{Q_{tt}}^{2}-w_{i}^{2}\) is small enough. If this is not the case, then the undecided decision is made, see Fig. 5.

The choice made for the undecided region \(\Omega \) may of course affect the regions \(\mathcal {P}_{i \ne 0}\) of some hypotheses more than others. For instance, if one or more of the alternative hypotheses turn out to be too poorly identifiable, one may choose to have their regions \(\mathcal {P}_{i}\) completely assigned to the undecided region \(\Omega \). In that case, one would only proceed with identification for a subset of the *k* alternative hypotheses. In the limiting special case when all alternative hypotheses are considered too poorly identifiable, the undecided strategy would become one for which \(\Omega = \mathbb {R}^{r}\backslash \mathcal {P}_{0}\). In this case, one thus computes \(\hat{x}_{0}\) if \(\mathcal {H}_{0}\) gets accepted, but states that the solution is unavailable otherwise. As a result, the testing procedure is confined to the detection step. This was, for instance, the case with earlier versions of RAIM, which had detection but no exclusion functionality, see e.g. Parkinson and Axelrad (1988), Sturza (1988).

To conclude this section, we pause a moment to further highlight some of the intricacies of the estimator (32). As the construction of \(\bar{x}\) has been based on a few principles only, it is important to understand that the estimator describes the outcome of *any* DIA method. In it, we recognize the separate contributions of \(p_{i}(t)\) and \(\hat{x}_{i}\). Both contribute to the uncertainty or randomness of \(\bar{x}\). The uncertainty of testing, i.e. of detection and identification, is channelled through \(p_{i}(t)\), while the uncertainty of estimation is channelled through \(\hat{x}_{i}\). Their combined outcome provides \(\bar{x}\) as an estimator of *x*. It is hereby important to realize, however, that there are for now no a priori 'optimality' properties that one can assign to \(\bar{x}\), despite the fact that its constituents do have some such properties. The estimator \(\hat{x}_{0}\), for instance, is optimal under \(\mathcal {H}_{0}\), as it is then the BLUE of *x*. And in case of a single alternative hypothesis \((k=1)\), also the testing can be done in an optimal way, namely by using uniformly most powerful invariant tests. These properties, however, are individual properties that do not necessarily carry over to \(\bar{x}\). One may ask oneself, for instance, why one should use \(\hat{x}_{i}\) when \(\mathcal {H}_{i}\) is selected. Why not use, instead of \(\hat{x}_{i}\), an estimator that takes the knowledge of \(t \in \mathcal {P}_{i}\) into account? Also note, as the testing itself gives discrete outcomes, that the DIA estimator is a *binary* weighted average of all of the \(k+1\) \(\hat{x}_{i}\)s. But one may wonder whether this binary weighting is the best one can do if the ultimate goal is the construction of a good estimator of *x*.
For instance, although the weights \(p_{i}(t)\) are binary in case of the DIA estimator, smoother weighting functions of the misclosures would provide a larger class of estimators that may contain estimators with better performance for certain defined criteria. This is in analogy with integer (I) and integer-equivariant (IE) estimation, for which the latter provides the larger class containing the optimal BIE estimator (Teunissen 2003b). And just like the nonBayesian BIE estimator was shown to have a Bayesian counterpart (Teunissen 2003b; Betti et al. 1993), the nonBayesian DIA estimator with smoother weights may find its counterpart in methods of Bayesian and information-theoretic multimodel inference (Burnham and Anderson 2002).

Answering these and similar questions on 'optimizing' the estimator \(\bar{x}\) is complex and not the goal of the present contribution. Our aim here is to provide a general framework that captures the testing and estimation characteristics in a combined way through the single estimator \(\bar{x}\). For that purpose, we present distributional properties of the DIA estimator \(\bar{x}\) in the next and following sections, thus making a rigorous quality evaluation of any estimator of the form of (32) possible.
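To make the combined scheme concrete, the following is a minimal sketch of one DIA cycle of the form (32) for datasnooping, under assumed settings (toy line-fit model; the helper name `dia_estimate` is ours, not the paper's): detection by the overall model test, identification by the largest \(|w_{i}|\), adaptation by \(\hat{x}_{i}=\hat{x}_{0}-A^{+}c_{i}\hat{b}_{i}\).

```python
import numpy as np
from scipy.linalg import null_space
from scipy.stats import chi2

def dia_estimate(A, Qyy, y, alpha=0.01):
    """One DIA cycle for datasnooping (illustrative helper).
    Returns (i, x_bar): i = 0 if H0 is accepted, i >= 1 if a single
    outlier in observation i-1 was identified and adapted for."""
    m, n = A.shape
    Qinv = np.linalg.inv(Qyy)
    Aplus = np.linalg.solve(A.T @ Qinv @ A, A.T @ Qinv)   # BLUE-inverse of A
    B = null_space(A.T)
    t = B.T @ y                                           # misclosure vector
    Qtt_inv = np.linalg.inv(B.T @ Qyy @ B)

    # Detection: overall model test ||t||^2_Qtt against a chi-square critical value.
    if t @ Qtt_inv @ t <= chi2.ppf(1 - alpha, m - n):
        return 0, Aplus @ y                               # H0 accepted: x_bar = x0_hat

    # Identification: largest |w_i| among the single-outlier hypotheses.
    w = np.array([(B.T[:, i] @ Qtt_inv @ t) /
                  np.sqrt(B.T[:, i] @ Qtt_inv @ B.T[:, i]) for i in range(m)])
    i = int(np.argmax(np.abs(w)))

    # Adaptation: x_bar = x0_hat - A^+ c_i b_hat.
    c_i, c_ti = np.eye(m)[:, i], B.T[:, i]
    b_hat = (c_ti @ Qtt_inv @ t) / (c_ti @ Qtt_inv @ c_ti)
    return i + 1, Aplus @ (y - c_i * b_hat)

# Noise-free illustration: clean data is accepted, a large outlier is adapted.
m = 6
A = np.column_stack([np.ones(m), np.arange(m, dtype=float)])
x_true = np.array([1.0, 0.5])
i0, x_clean = dia_estimate(A, np.eye(m), A @ x_true)
i1, x_adapt = dia_estimate(A, np.eye(m), A @ x_true + 10.0 * np.eye(m)[:, 2])
```

The hard threshold/argmax decisions here are exactly the binary weights \(p_{i}(t)\); replacing them by smoother weighting functions of *t* would yield the larger estimator class discussed above.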

## 3 The distribution of the DIA estimator

### 3.1 The joint, conditional and marginal PDFs

In order to be able to study the properties of the DIA estimator, we need its probability distribution. As its performance is driven for a large part by the misclosure vector *t*, we determine the joint PDF \(f_{\bar{x},t}(x,t)\) and the conditional PDF \(f_{\bar{x}|t}(x|t)\), next to the marginal PDF \(f_{\bar{x}}(x)\). We express these PDFs in terms of the PDFs \(f_{\hat{x}_{0}}(x)\) and \(f_{t}(t)\) of \(\hat{x}_{0}\) and *t*, respectively. We have the following result.

### Theorem 1

*(PDFs of \(\bar{x}\) and t)* Let \(\bar{x}\) be given as (32), with the \(\hat{x}_{i}\)s related to \(\hat{x}_{0}\) and *t* as in (13). Then, the joint, conditional and marginal PDFs of the DIA estimator \(\bar{x}\) and misclosure vector *t* can be expressed in the PDFs \(f_{\hat{x}_{0}}(x)\) and \(f_{t}(t)\) as

\(f_{\bar{x},t}(x,t)=f_{\hat{x}_{0}}(x+\bar{L}(t)t)\,f_{t}(t)\)

\(f_{\bar{x}|t}(x|t)=f_{\hat{x}_{0}}(x+\bar{L}(t)t)\)

\(f_{\bar{x}}(x)=\sum _{i=0}^{k}\int _{\mathcal {P}_{i}}f_{\hat{x}_{0}}(x+L_{i}\tau )\,f_{t}(\tau )\,\mathrm {d}\tau \)

with \(\bar{L}(t)=\sum _{i=1}^{k}L_{i}p_{i}(t)\) and \(L_{0}=0\).

### Proof

See Appendix. \(\square \)

The marginal PDF of \(\bar{x}\) is thus constructed from shifted versions of the PDF of \(\hat{x}_{0}\), the shifts being driven by the misclosure vector *t*. It is thus indeed a nonnormal distribution, which will only approach a normal one when the PDF of the misclosures is sufficiently peaked. For instance, when \(f_{t}(t)=\delta (t-\tau )\) and \(\tau \in \mathcal {P}_{j}\), then \(f_{\bar{x}}(x)=f_{\hat{x}_{0}}(x+L_{j}\tau )\).

### Example 5

Quite often in practice one is not interested in the complete parameter vector \(x \in \mathbb {R}^{n}\), but rather only in certain functions of it, say \(\theta = F^{T}x \in \mathbb {R}^{p}\). As its DIA estimator is then computed as \(\bar{\theta }=F^{T}\bar{x}\), we need its distribution to evaluate its performance.

### Corollary 1

Although we will be working with \(\bar{x}\), instead of \(\bar{\theta }\), in the remainder of this contribution, it should be understood that the results provided can similarly be given for \(\bar{\theta }=F^{T}\bar{x}\) as well.

### 3.2 The mean of \(\bar{x}\) under \(\mathcal {H}_{0}\) and \(\mathcal {H}_{a}\)

The estimators \(\hat{x}_{i}\), \(i=0, 1, \ldots , k\), (cf. 2, 4) are BLUEs and therefore unbiased under their respective hypotheses, e.g. \(\mathsf {E}(\hat{x}_{0}| \mathcal {H}_{0})=x\) and \(\mathsf {E}(\hat{x}_{a}| \mathcal {H}_{a})=x\). However, as shown earlier, these are not the estimators that are actually computed when testing is involved. In that case, it is the DIA estimator \(\bar{x}\) that is produced. As unbiasedness is generally a valued property of an estimator, it is important to know the mean of \(\bar{x}\). It is given in the following theorem.

### Theorem 2

### Proof

See Appendix. \(\square \)

Theorem 2 shows that the DIA estimator is unbiased under \(\mathcal {H}_{0}\), but *always* biased under the alternative,

**Table 1** The mean of \(\hat{x}_{0}\), \(\bar{x}\) and \(\hat{x}_{a}\) under \(\mathcal {H}_{0}\) and \(\mathcal {H}_{a}\), with \(b_{y_{a}}=C_{a}b_{a}\) and \(\bar{b}_{y_{a}}=b_{y_{a}}-\sum _{i=1}^{k}C_{i}\beta _{i}(b_{a})\)

| | \(\mathcal {H}_{0}\) | \(\mathcal {H}_{a}\) |
|---|---|---|
| \(\hat{x}_{0}\) | \(\mathsf {E}(\hat{x}_{0}\mid \mathcal {H}_{0})=x\) | \(\mathsf {E}(\hat{x}_{0}\mid \mathcal {H}_{a})=x+A^{+}b_{y_{a}}\) |
| \(\bar{x}\) | \(\mathsf {E}(\bar{x}\mid \mathcal {H}_{0})=x\) | \(\mathsf {E}(\bar{x}\mid \mathcal {H}_{a})=x+A^{+}\bar{b}_{y_{a}}\) |
| \(\hat{x}_{a}\) | \(\mathsf {E}(\hat{x}_{a}\mid \mathcal {H}_{0})=x\) | \(\mathsf {E}(\hat{x}_{a}\mid \mathcal {H}_{a})=x\) |

In the limit, when \(f_{t}(t|\mathcal {H}_{a})\) is sufficiently peaked at the testable bias \(B^{T}C_{a}b_{a}\) (cf. Theorem 2), the second term in (48) reduces to \(A^{+}C_{a}b_{a}\), thereby removing all the bias from \(\hat{x}_{0}\) under \(\mathcal {H}_{a}\). A summary of the means of \(\hat{x}_{0}\), \(\bar{x}\) and \(\hat{x}_{a}\) is given in Table 1.

### Example 6

One can also compare the weighted mean squared errors of the three estimators \(\hat{x}_{0}\), \(\bar{x}\) and \(\hat{x}_{a}\) under \(\mathcal {H}_{a}\). By making use of the fact that \(\mathsf {E}(||u||_{Q}^{2})=\mathrm{trace}(Q^{-1}Q_{uu})+\mu ^{T}Q^{-1}\mu \) if \(\mathsf {E}(u)=\mu \) and \(\mathsf {D}(u)=Q_{uu}\) (Koch 1999), we have the following result.
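The quadratic-form identity invoked here can be checked by simulation; a minimal Monte Carlo sketch with arbitrary numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# E(||u||_Q^2) = trace(Q^{-1} Q_uu) + mu^T Q^{-1} mu  for E(u) = mu, D(u) = Q_uu.
mu = np.array([1.0, -2.0])
Quu = np.array([[2.0, 0.3], [0.3, 1.0]])
Q = np.array([[1.5, 0.2], [0.2, 0.8]])
Qinv = np.linalg.inv(Q)

u = rng.multivariate_normal(mu, Quu, size=200_000)
mc = np.mean(np.einsum('ij,jk,ik->i', u, Qinv, u))   # Monte Carlo estimate of E(||u||_Q^2)
exact = np.trace(Qinv @ Quu) + mu @ Qinv @ mu
```

The sample mean agrees with the closed-form value up to the Monte Carlo error.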

### Corollary 2

### 3.3 Bias in BLUP, Kriging and collocation

The BLUP of *z* under \(\mathcal {H}_{i}\) is then given as

### Corollary 3

## 4 Decomposition of probabilities

### 4.1 Correct and incorrect decisions

In the DIA method, a decision is made on the basis of the misclosure vector *t*, i.e. choose \(\mathcal {H}_{i}\) if \(t \in \mathcal {P}_{i}\). Such a decision is correct if \(\mathcal {H}_{i}\) is true, and it is incorrect otherwise. The probabilities of such occurrences can be put into a *probability matrix*:

With the PDFs of *t* under \(\mathcal {H}_{j}\) and \(\mathcal {H}_{0}\) (cf. 8), the entries of the probability matrix can be computed as

where \(\tau ^{2} = \chi ^{2}_{\alpha }(r,0)\) denotes the \(\alpha \)-level critical value of the central Chi-square distribution with *r* degrees of freedom.

Although we use the same symbol for the biases of the different alternatives, carrying only the subscript *a*, it should be understood that they differ from alternative to alternative. For each such \(\mathcal {H}_{a}\), the probability of missed detection can be evaluated as

In the case of a *single* alternative hypothesis \((k=1)\), the probability of wrong identification is identically zero, \(\mathsf {P}_\mathrm{WI}=0\), and we have \(\mathsf {P}_\mathrm{CI}=1-\mathsf {P}_\mathrm{MD}\). In the general case of multiple hypotheses \((k>1)\), the available probability of correct detection is spread out over all alternative hypotheses, whether correct or wrong, thus diminishing the probability of correct identification. It is up to the designer of the testing system to ensure that \(\mathsf {P}_\mathrm{CI}\) remains large enough for the application at hand. We have more to say about \(\mathsf {P}_\mathrm{CI} \ne 1-\mathsf {P}_\mathrm{MD}\) when discussing the concept of minimal detectable biases in Sect. 4.4.
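For a concrete partitioning, the row entries \(\mathsf {P}_\mathrm{MD}\), \(\mathsf {P}_\mathrm{CI}\) and \(\mathsf {P}_\mathrm{WI}\) of such a probability matrix can be approximated by Monte Carlo simulation. A sketch under assumed settings (toy line-fit datasnooping, \(Q_{yy}=I\), a 4-sigma outlier; detection by the overall model test, identification by the largest \(|w_{i}|\)):

```python
import numpy as np
from scipy.stats import chi2
from scipy.linalg import null_space

rng = np.random.default_rng(4)

m, alpha = 6, 0.05
A = np.column_stack([np.ones(m), np.arange(m, dtype=float)])
B = null_space(A.T)                        # Qyy = I, hence Qtt = I_r
r = m - A.shape[1]
tau2 = chi2.ppf(1 - alpha, r)              # tau^2 = chi^2_alpha(r, 0)

a, b_a, N = 3, 4.0, 100_000                # H_a: 4-sigma outlier in observation 3
Y = rng.standard_normal((N, m))
Y[:, a] += b_a
T = Y @ B                                  # misclosure vectors, one row per sample
detected = np.sum(T**2, axis=1) > tau2     # overall model test rejects H0
W = np.abs(T @ B.T) / np.linalg.norm(B, axis=1)   # |w_i| per sample; c_ti = i-th row of B
identified = np.argmax(W, axis=1)

P_MD = np.mean(~detected)                        # missed detection
P_CI = np.mean(detected & (identified == a))     # correct identification
P_WI = np.mean(detected & (identified != a))     # wrong identification
```

The three estimates sum to one, illustrating how the correct-detection probability is split between correct and wrong identification.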

### 4.2 The bias decomposition of \(\bar{x}\)

### Theorem 3

### Proof

See Appendix. \(\square \)

The above result shows that the DIA estimator \(\bar{x}\) is not only unconditionally biased under \(\mathcal {H}_{a}\), \(b_{\bar{x}}\ne 0\), but also when it is conditioned on *correct identification*, \(b_{\bar{x}|\mathrm CI} \ne 0\). Thus if one were to repeat the measurement–estimation–testing experiment of the DIA estimator a sufficient number of times under a true \(\mathcal {H}_{a}\) and collect only the correctly adapted outcomes of \(\bar{x}\), their expectation would still not coincide with *x*. This fundamental result puts the often argued importance of unbiasedness (e.g. in BLUEs) in a somewhat different light.

*datasnooping*, when the vectors \(C_{i}=c_{i}\) take the form of canonical unit vectors, the conditional bias under correct identification is given as

*a*th observation, this is not true anymore for the bias contribution in the unconditional case. In the unconditional case, the bias in \(\bar{x}\) also receives contributions from the entries of \(\bar{b}_{y_{a}}\) other than its *a*th one: whereas \(\bar{b}_{y_{a}|\mathrm{CI}}\) only has its *a*th entry being nonzero, the vector \(\bar{b}_{y_{a}}\) has, next to its *a*th entry, also its other entries filled with nonzero values.

### 4.3 The PDF decomposition of \(\bar{x}\) under \(\mathcal {H}_{0}\) and \(\mathcal {H}_{a}\)

Similar to the above bias decomposition, we can decompose the unconditional PDF of \(\bar{x}\) into its conditional constituents under \(\mathcal {H}_{0}\) and \(\mathcal {H}_{a}\). We have the following result.

### Theorem 4

### Proof

See Appendix. \(\square \)

*not* independent of the misclosure vector *t*. Thus

### 4.4 On the minimal detectable bias

*r*. For the higher-dimensional case when \(b_{a}\) is a vector instead of a scalar, a similar expression can be obtained, see Teunissen (2000).

The importance of the MDB concept is that it expresses the sensitivity of the *detection* step of the DIA method in terms of minimal bias sizes of the respective hypotheses. By using the same \(\mathsf {P}_\mathrm{CD}=1-\mathsf {P}_\mathrm{MD}=\gamma _\mathrm{CD}\) for all \(\mathcal {H}_{a}\), the MDBs can be compared and provide information on the sensitivity of rejecting the null hypothesis for \(\mathcal {H}_{a}\)-biases of the size of their MDBs. For instance, in the case of datasnooping of a surveyor’s trilateration network, the MDBs and their mutual comparison would reveal for which observed distance in the network the rejection of \(\mathcal {H}_{0}\) would be least sensitive (namely the distance with the largest MDB), as well as how large its distance bias needs to be for rejection to occur with probability \(\mathsf {P}_\mathrm{CD}=\gamma _\mathrm{CD}\).

*single* alternative hypothesis is considered. For identification in the multiple-hypotheses case \((k>1)\), one can, however, pose a question somewhat similar to the one that led to the MDB: what is the smallest bias of an alternative hypothesis \(\mathcal {H}_{a}\) that leads to its identification for a given CI probability? Thus, similar to (75), such a *minimal identifiable bias* is found from ’inverting’

### Example 7

Since \(\mathsf {P}_\mathrm{CD} \ge \mathsf {P}_\mathrm{CI}\), one can expect the minimal identifiable bias to be larger than the MDB when \(\mathsf {P}_\mathrm{CI}=\gamma _\mathrm{CD}\). Correct identification is thus more difficult than correct detection. Their difference depends on the probability of wrong identification. The smaller \(\mathsf {P}_\mathrm{WI}\) is, the closer \(\mathsf {P}_\mathrm{CI}\) gets to \(\mathsf {P}_\mathrm{CD}\).

We note that in the simulation studies of Koch (2016, 2017) discrepancies were reported between the MDB and simulated values. Such discrepancies could perhaps have been the consequence of not taking the difference between the MDB and the minimal identifiable bias into account.
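The ’inversion’ that defines the minimal identifiable bias, and its comparison with the MDB, can be sketched numerically in a simple one-dimensional setting (hypothetical threshold and unit-variance normal misclosure; bisection stands in for whatever root-finder one prefers):

```python
from math import erf, sqrt

def Phi(z):  # standard-normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical 1-D setup: t ~ N(b, 1) under Ha, detection region |t| > tau,
# identification region for Ha: t > tau.
tau, gamma = 2.0, 0.8

def P_CD(b):  # probability of correct detection as a function of the bias b >= 0
    return 1.0 - (Phi(tau - b) - Phi(-tau - b))

def P_CI(b):  # probability of correct identification
    return 1.0 - Phi(tau - b)

def invert(P, gamma, lo=0.0, hi=20.0):
    # bisection: the bias b with P(b) = gamma (P is increasing in b for b >= 0)
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if P(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mdb = invert(P_CD, gamma)   # minimal detectable bias
mib = invert(P_CI, gamma)   # minimal identifiable bias
print(mdb, mib)             # mib >= mdb: identification is harder than detection
```

In this toy case \(\mathsf {P}_\mathrm{WI}\) is tiny, so the MIB exceeds the MDB only marginally; with more competing alternatives the gap widens, which is exactly the kind of discrepancy discussed above.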

*A* and/or the variance matrix \(Q_{yy}\)) or decide to have the regions \(\mathcal {P}_{i \ne 0}\) of the poorly identifiable hypotheses added to the undecided region. In the latter case, one would allow such hypotheses to contribute to the rejection of \(\mathcal {H}_{0}\), but not to its adaptation.

### 4.5 On the computation of \(\mathsf {P}_\mathrm{CI}\)

*N* the total number of samples and \(\tau _{i}\), \(i=1, \ldots , N\), being the samples drawn from \(f_{t}(\tau |\mathcal {H}_{a})=f_{t}(\tau -B^{T}C_{a}b_{a, \mathrm MDB}| \mathcal {H}_{0})\) and thus from the multivariate normal distribution \(\mathcal {N}(B^{T}C_{a}b_{a, \mathrm MDB}, Q_{tt})\). Whether or not the drawn sample \(\tau =\tau _{i}\) contributes to the average is regulated by the indicator function \(p_{a}(\tau )\) and thus by the actual test. The approximation (83) constitutes the simplest form of simulation-based numerical integration. For more advanced methods, see Robert and Casella (2013).
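A minimal sketch of this sample-average approximation is given below; the two-dimensional misclosure, its shifted mean and the indicator \(p_{a}\) are hypothetical stand-ins for the actual partitioning, chosen only to show the mechanics of (83):

```python
import random

random.seed(7)

# Simulation-based approximation of P_CI (cf. (83)): draw tau_i from
# N(B^T C_a b_a, Q_tt) and average the indicator of the identification region.
# Hypothetical 2-D example: Q_tt = I, shifted mean mu = B^T C_a b_a = (2.5, 0).
mu = (2.5, 0.0)
crit = 2.0  # hypothetical detection threshold on each component

def p_a(t1, t2):
    # indicator of P_a: H_a identified when the first component dominates
    # and detection occurs (a stand-in for a w-test based partition)
    return 1.0 if abs(t1) > crit and abs(t1) >= abs(t2) else 0.0

N = 100_000
P_CI = sum(p_a(random.gauss(mu[0], 1.0), random.gauss(mu[1], 1.0))
           for _ in range(N)) / N
print(P_CI)  # simple sample-average approximation of the integral
```

The standard error of such an estimate decreases as \(1/\sqrt{N}\), which is why the more advanced variance-reduction methods of Robert and Casella (2013) become attractive for small probabilities.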

## 5 How to evaluate and use the DIA estimator?

### 5.1 Unconditional evaluation

*x*, it is likely to be considered a good estimator if the probability of \(\bar{x}\) being close to *x* is sufficiently large. Although the quantification of such terms as ’close’ and ’sufficiently large’ is application dependent, we assume that a (convex) shape and size of an *x*-centred region \(\mathcal {B}_{x} \subset \mathbb {R}^{n}\), as well as the required probability \(1-\epsilon \) of the estimator residing in it, is given. Thus if \(\mathcal {H}\) were the true hypothesis, the estimator \(\bar{x}\) would be considered an acceptable estimator of *x* if the inequality \(\mathsf {P}(\bar{x} \in \mathcal {B}_{x} | \mathcal {H}) \ge 1-\epsilon \), or

*hazardous* probability. Using (33) and (39) of Theorem 1, we have the following result.

### Corollary 4

*A* and \(Q_{yy}\)) and/or a testing design (i.e. choice of the \(\mathcal {P}_{i}\)’s and their partitioning) that realizes sufficiently small hazardous probabilities. One may then also take advantage of the difference between *influential* and *testable* biases, by testing less stringently for biases that are less influential. Under certain circumstances, one may even try to optimize the DIA estimator, by minimizing an (ordinary or weighted) average hazardous probability

### Example 8

### 5.2 Detection only: precision and reliability

In the detection-only case, the solution \(\bar{x}\) is declared unavailable when the null hypothesis is rejected, i.e. when \(t \notin \mathcal {P}_{0}\). The probability of such an outcome is, under \(\mathcal {H}_{0}\), the false-alarm probability \(\mathsf {P}(t \notin \mathcal {P}_{0}|\mathcal {H}_{0})=\mathsf {P}_\mathrm{FA}\), and under \(\mathcal {H}_{a}\), the probability of correct detection \(\mathsf {P}(t \notin \mathcal {P}_{0}|\mathcal {H}_{a})=\mathsf {P}_\mathrm{CD}\). The false-alarm probability is often user-controlled by setting the appropriate size of \(\mathcal {P}_{0}\). The probability of correct detection, however, depends on which \(\mathcal {H}_{a}\) is considered and on the size of its bias \(b_{a}\). We have \(\mathsf {P}_\mathrm{CD}=\mathsf {P}_\mathrm{FA}\) for \(b_{a}=0\), but \(\mathsf {P}_\mathrm{CD} > \mathsf {P}_\mathrm{FA}\) otherwise.
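The interplay between the false-alarm rate and \(\mathsf {P}_\mathrm{CD}\) can be illustrated with a small Monte Carlo sketch (hypothetical \(r\), bias vector and \(\alpha \); the threshold \(\tau ^{2}=\chi ^{2}_{\alpha }(r,0)\) is here obtained empirically rather than from a chi-square table):

```python
import random

random.seed(3)

# Monte Carlo sketch of detection with the overall model test: under H0 the
# test statistic ||t||^2 (Q_tt = I here) is chi-square with r degrees of
# freedom; under Ha it is noncentral with noncentrality ||B^T C_a b_a||^2.
# Hypothetical r = 3 and testable bias vector mu.
r, N = 3, 100_000
mu = (2.0, 1.0, 0.0)  # B^T C_a b_a under Ha (hypothetical)

def stat(mean):
    # squared norm of a normal vector with unit variances
    return sum(random.gauss(m, 1.0) ** 2 for m in mean)

# set the threshold tau2 for a false-alarm rate alpha = 0.05 by simulation
alpha = 0.05
h0 = sorted(stat((0.0,) * r) for _ in range(N))
tau2 = h0[int((1.0 - alpha) * N)]        # empirical chi^2_alpha(r, 0)

P_CD = sum(stat(mu) > tau2 for _ in range(N)) / N
print(tau2, P_CD)   # P_CD exceeds the false-alarm rate alpha for b_a != 0
```

The empirical threshold lands near the tabulated \(\chi ^{2}_{0.05}(3,0) \approx 7.81\), and the correct-detection probability grows with the size of the testable bias, consistent with \(\mathsf {P}_\mathrm{CD} > \mathsf {P}_\mathrm{FA}\) for \(b_{a} \ne 0\).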

*t*. As \(\mathsf {P}_\mathrm{CA}\) is user-fixed and \(\hat{x}_{0} \mathop {\sim }\limits ^{\mathcal {H}_{0}} \mathcal {N}(x, Q_{\hat{x}_{0}\hat{x}_{0}})\), the hazardous probability under \(\mathcal {H}=\mathcal {H}_{0}\) is completely driven by the variance matrix \(Q_{\hat{x}_{0}\hat{x}_{0}}\) and thus by the *precision* of \(\hat{x}_{0}\). This is not the case under \(\mathcal {H}_{a}\), however, since the hazardous probability, also referred to as the hazardous missed detection (HMD) probability, then becomes dependent on \(b_{a}\) as well,

*influential* bias \(b_{\hat{x}_{0}}=A^{+}C_{a}b_{a}\), while the missed detection probability is driven by the *testable* bias \(b_{t}=B^{T}C_{a}b_{a}\) (cf. 9).

An alternative, more conservative approach would be to directly evaluate (92) as a function of the bias \(b_{a}\). As \(\mathsf {P}_\mathrm{MD}\) gets smaller, but \(\mathsf {P}(\hat{x}_{0} \notin \mathcal {B}_{x} | \mathcal {H}_{a})\) gets larger, for larger biases, the probability (92) will attain a maximum for a certain bias. With this approach, one can thus evaluate whether the ’worst-case’ scenario \(\max _{b_{a}} \mathsf {P}_\mathrm{HMD}(b_{a})\) for each of the hypotheses still satisfies one’s criterion (Ober 2003; Teunissen 2017).
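A one-dimensional sketch of this worst-case search (all constants hypothetical; the independence of \(\hat{x}_{0}\) and *t* is used to write the HMD probability as a product):

```python
from math import erf, sqrt

def Phi(z):  # standard-normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical 1-D sketch of the worst-case HMD search: under Ha,
# x0_hat ~ N(x + c*b, 1) (influential bias c*b) and, independently,
# t ~ N(d*b, 1) (testable bias d*b). With B_x = [x - beta, x + beta] and
# detection region |t| > tau,
#   P_HMD(b) = P(x0_hat not in B_x | Ha) * P_MD(b).
c, d, beta, tau = 0.7, 0.7, 2.0, 2.5

def P_HMD(b):
    p_out = 1.0 - (Phi(beta - c * b) - Phi(-beta - c * b))  # hazard grows with b
    p_md  = Phi(tau - d * b) - Phi(-tau - d * b)            # MD shrinks with b
    return p_out * p_md

# coarse grid search for the worst-case bias
grid = [i * 0.01 for i in range(0, 2001)]
b_worst = max(grid, key=P_HMD)
print(b_worst, P_HMD(b_worst))  # the hazardous probability peaks at an intermediate bias
```

The two opposing trends produce an interior maximum, which is precisely the ’worst-case’ bias one would check against the integrity criterion.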

As the above computations can be done without having the actual measurements available, they are very useful for design verification purposes. Starting from a certain assumed design or measurement setup, as described by *A* and \(Q_{yy}\), one can then infer how well the design can be expected to protect against biases in the event that one of the alternative hypotheses is true.

### 5.3 Detection and identification

### 5.4 Conditional evaluation

In practice, the outcome produced by an estimation–testing scheme is often not the end product, but just an intermediate step in a whole processing chain. Such follow-on processing, for which the outcome of the DIA estimator \(\bar{x}\) is used as input, then also requires the associated quality description. This is not difficult in principle and only requires the proper forward propagation of the unconditional PDF of \(\bar{x}\). In practice, however, this is not how it is done: two approximations are often made when using the DIA estimator. The first approximation is that \(\bar{x}\) is not evaluated unconditionally, but rather conditionally on the outcome of testing. Thus instead of working with the PDF of \(\bar{x}\), one works with the PDF of \((\bar{x}|t \in \mathcal {P}_{i})\), i.e. the one conditioned on the outcome of testing, \(t \in \mathcal {P}_{i}\). The second approximation is that one then neglects this conditioning and uses the unconditional PDF of \(\hat{x}_{i}\) instead for the evaluation. But the fact that the random vector \((\bar{x}|t \in \mathcal {P}_{i})\) has the outcome \(\hat{x}_{i}\) does not mean that the two random vectors \((\bar{x}|t \in \mathcal {P}_{i})\) and \(\hat{x}_{i}\) have the same distribution. This would only be the case if \(\hat{x}_{i}\) and *t* were independent, which is true for \(\hat{x}_{0}\) and *t*, but not for \(\hat{x}_{i \ne 0}\) and *t*.

From a practical point of view, of course, it would be easiest if it indeed sufficed to work with the relatively simple normal PDFs of \(\hat{x}_{i}\). In all subsequent processing, one could then work with the PDF of \(\hat{x}_{0}\), \(\mathcal {N}(x, Q_{\hat{x}_{0}\hat{x}_{0}})\), if the null hypothesis gets accepted, and with the PDF of \(\hat{x}_{a}\), \(\mathcal {N}(x, Q_{\hat{x}_{a}\hat{x}_{a}})\), if the corresponding alternative hypothesis gets identified. To show what approximations are involved when doing so, we start with the case in which the null hypothesis is accepted.

*t* are independent,

### Corollary 5

### Proof

Follows from an application of the PDF total probability rule and (39). \(\square \)

The difference between the two PDFs gets smaller the larger the probability \(\mathsf {P}(t \in \mathcal {P}_{a})\), thus illustrating, under \(\mathcal {H}_{a}\), the importance of having a large enough probability of correct identification, \(\mathsf {P}_\mathrm{CI}=\mathsf {P}(t \in \mathcal {P}_{a}|\mathcal {H}_{a})\). Note that the conditional PDF is not normal (Gaussian), but can be seen as a ’weighted sum’ of shifted versions of \(f_{\hat{x}_{0}}(x)\). The more peaked \(f_{t}(\tau )\) is, the fewer terms in this ’sum’.
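The effect of neglecting the conditioning can be made visible by simulation. The sketch below uses a hypothetical one-dimensional stand-in for the adapted estimator, \(\hat{x}_{a}=\hat{x}_{0}-L(t-\mu _{t})\), which is unbiased under \(\mathcal {H}_{a}\) but correlated with *t*; conditioning on \(t \in \mathcal {P}_{a}\) then shifts its distribution:

```python
import random

random.seed(11)

# Hypothetical 1-D model under Ha: x0_hat ~ N(0, 1) independent of
# t ~ N(mu_t, 1), and the adapted estimator xa = x0_hat - L*(t - mu_t),
# which is unbiased under Ha but correlated with t. Conditioning on the
# identification outcome t in P_a (here t > tau) changes its distribution:
# (xa | t > tau) is a 'weighted sum' of shifted normals, not N(0, .).
L, mu_t, tau, N = 0.8, 2.5, 2.0, 200_000

uncond, cond = [], []
for _ in range(N):
    x0 = random.gauss(0.0, 1.0)
    t = random.gauss(mu_t, 1.0)
    xa = x0 - L * (t - mu_t)   # adapted estimator, unbiased under Ha
    uncond.append(xa)
    if t > tau:                # correct identification: t in P_a
        cond.append(xa)

mean_uncond = sum(uncond) / len(uncond)
mean_cond = sum(cond) / len(cond)
print(mean_uncond, mean_cond)  # conditioning shifts the mean away from zero
```

Treating the conditional outcome as if it followed the unconditional normal PDF of \(\hat{x}_{a}\) thus misstates both its mean and its shape, which is the approximation error discussed above.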

*t* are).

### Example 9

## 6 Summary and conclusions

It was shown that although all the \(\hat{x}_{i}\)’s are unbiased under their respective hypotheses, the estimator \(\bar{x}\) itself is *not* unbiased, except under the null hypothesis. Such a bias was also shown to be present in the results of prediction. The nature of the bias was studied, and it was shown that a nonzero bias remains even in the case of correct identification. Thus, all successful adaptations still produce results that are biased. This implies, for instance, that any successful outlier detection and exclusion method will always produce parameter outcomes that are still biased. It was shown how this bias can be evaluated and on what contributing factors it depends.

*t*, as

*detection* step in terms of minimal bias sizes of the respective hypotheses. Thus by using the same value \(\gamma _\mathrm{CD}\) for all \(\mathcal {H}_{a}\)’s, the MDBs can be compared and provide information on the sensitivity of rejecting the null hypothesis for \(\mathcal {H}_{a}\)-biases of the size of their MDBs. This does not, however, necessarily provide information on the *identifiability* of the hypotheses. We therefore introduced, in analogy with (108), a minimal identifiable bias (MIB) as the smallest bias of an alternative hypothesis that leads to its identification for a given CI probability. Would one want to compare the identifiability of alternative hypotheses for the same probability of correct detection, however, then the ’forward’ computed probability

## Notes

### Acknowledgements

This contribution benefited from the author’s discussions with Drs Davide Imparato and Christian Tiberius. Mrs Safoora Zaminpardaz contributed the figures. The author is the recipient of an Australian Research Council (ARC) Federation Fellowship (Project Number FF0883188). This support is gratefully acknowledged.

## References

- Alberda JE (1976) Quality control in surveying. Chart Surv 4(2):23–28
- Arnold S (1981) The theory of linear models and multivariate analysis, vol 2. Wiley, New York
- Baarda W (1967) Statistical concepts in geodesy. Netherlands Geodetic Commission, Publ. on Geodesy, New Series 2(4)
- Baarda W (1968a) A testing procedure for use in geodetic networks. Netherlands Geodetic Commission, Publ. on Geodesy, New Series 2(5)
- Baarda W (1968b) Enkele inleidende beschouwingen tot de B-methode van toetsen. Tech. rep., Laboratorium voor Geodesie, Delft
- Baarda W (1976) Reliability and precision of geodetic networks. Tech. rep., VII Int. Kurs für Ingenieursmessungen hoher Präzision, Band I, TH Darmstadt, Inst. für Geodäsie
- Betti B, Crespi M, Sanso F (1993) A geometric illustration of ambiguity resolution in GPS theory and a Bayesian approach. Manuscr Geod 18:317–330
- Brown R (1996) Receiver autonomous integrity monitoring. Glob Position Syst: Theory Appl 2:143–165
- Burnham K, Anderson D (2002) Model selection and multimodel inference: a practical information-theoretic approach. Springer, Berlin
- DGCC (1982) The Delft approach for the design and computation of geodetic networks. In: “Forty Years of Thought”, anniversary edition on the occasion of the 65th birthday of Professor W Baarda, by staff of the Delft Geodetic Computing Centre (DGCC), vol 1, pp 202–274
- Drevelle V, Bonnifait P (2011) A set-membership approach for high integrity height-aided satellite positioning. GPS Solut 15(4):357–368
- Fan L, Zhai G, Chai H (2011) Study on the processing method of cycle slips under kinematic mode. Theor Math Found Comput Sci 164:175–183
- Foerstner W (1983) Reliability and discernability of extended Gauss-Markov models. Deutsche Geod Komm 98:79–103
- Gillissen I, Elema I (1996) Test results of DIA: a real-time adaptive integrity monitoring procedure, used in an integrated navigation system. Int Hydrogr Rev 73(1):75–103
- Hewitson S, Wang J (2006) GNSS receiver autonomous integrity monitoring (RAIM) performance analysis. GPS Solut 10(3):155–170
- Hewitson S, Kyu Lee H, Wang J (2004) Localizability analysis for GPS/Galileo receiver autonomous integrity monitoring. J Navig 57(02):245–259
- Imparato D (2016) GNSS-based receiver autonomous integrity monitoring for aircraft navigation. Delft University of Technology
- Joerger M, Pervan B (2014) Solution separation and Chi-squared ARAIM for fault detection and exclusion. In: Position, location and navigation symposium (PLANS 2014), IEEE/ION, pp 294–307
- Kargoll B (2007) On the theory and application of model misspecification tests in geodesy. Universitäts- und Landesbibliothek Bonn
- Kelly R (1998) The linear model, RNP, and the near-optimum fault detection and exclusion algorithm. Glob Position Syst 5:227–259
- Koch K (1999) Parameter estimation and hypothesis testing in linear models. Springer, Berlin
- Koch KR (2016) Minimal detectable outliers as measures of reliability. J Geodesy 89:483–490
- Koch KR (2017) Expectation maximization algorithm and its minimal detectable outliers. Stud Geophys Geod 61
- Kok J (1982) Statistical analysis of deformation problems using Baarda’s testing procedures. In: “Forty Years of Thought”, anniversary volume on the occasion of Prof Baarda’s 65th birthday, Delft, vol 2, pp 470–488
- Kok JJ (1984) On data snooping and multiple outlier testing. US Department of Commerce, National Oceanic and Atmospheric Administration, National Ocean Service, Charting and Geodetic Services
- Ober P (2003) Integrity prediction and monitoring of navigation systems, vol 1. Integricom Publishers, Leiden
- Parkinson B, Axelrad P (1988) Autonomous GPS integrity monitoring using the pseudorange residual. Navigation 35(2):255–274
- Perfetti N (2006) Detection of station coordinate discontinuities within the Italian GPS fiducial network. J Geodesy 80(7):381–396
- Robert C, Casella G (2013) Monte Carlo statistical methods. Springer, Berlin
- Salzmann M (1991) MDB: a design tool for integrated navigation systems. Bull Geodesique 65(2):109–115
- Salzmann M (1993) Least squares filtering and testing for geodetic navigation applications. Netherlands Geodetic Commission, Publ. on Geodesy, New Series (37)
- Sturza M (1988) Navigation system integrity monitoring using redundant measurements. Navigation 35(4):483–501
- Teunissen PJG (1985) Quality control in geodetic networks. In: Grafarend EW, Sanso F (eds) Optimization and design of geodetic networks, pp 526–547
- Teunissen PJG (1989) Quality control in integrated navigation systems. IEEE Aerosp Electron Syst Mag 5(7):35–41
- Teunissen PJG (1990) An integrity and quality control procedure for use in multi sensor integration. In: Proceedings ION GPS (republished in ION Red Book Series, vol 7, 2010)
- Teunissen PJG (1998a) Minimal detectable biases of GPS data. J Geodesy 72(4):236–244
- Teunissen PJG (1998b) Quality control and GPS. In: GPS for geodesy, 2nd edn, chap 7, pp 187–229
- Teunissen PJG (2000) Testing theory: an introduction. Delft University Press, Series on Mathematical Geodesy and Positioning
- Teunissen PJG (2003a) Integer aperture GNSS ambiguity resolution. Artif Satel 38(3):79–88
- Teunissen PJG (2003b) Theory of integer equivariant estimation with application to GNSS. J Geodesy 77:402–410
- Teunissen PJG (2017) Batch and recursive model validation. In: Teunissen PJG, Montenbruck O (eds) Handbook of global navigation satellite systems, chap 24, pp 187–229
- Teunissen PJG, Amiri-Simkooei AR (2008) Least-squares variance component estimation. J Geodesy 82(2):65–82
- Teunissen PJG, Khodabandeh A (2013) BLUE, BLUP and the Kalman filter: some new results. J Geodesy 87:461–473
- Teunissen PJG, Salzmann MA (1989) A recursive slippage test for use in state-space filtering. Manuscr Geod 14(6):383–390
- Teunissen PJG, Imparato D, Tiberius CCJM (2017) Does RAIM with correct exclusion produce unbiased positions? Sensors 17(7):1508. doi:10.3390/s17071508
- Tiberius C (1998) Recursive data processing for kinematic GPS surveying. Netherlands Geodetic Commission, Publ. on Geodesy, New Series (45)
- Tienstra J (1956) Theory of the adjustment of normally distributed observations. Argus, Amsterdam
- Van der Marel H, Kosters A (1990) Statistical testing and quality analysis in 3-d networks: (part II) application to GPS. Int Assoc Geod Symp 102:290–297
- Van Mierlo J (1980) A testing procedure for analytic deformation measurements. In: Proceedings of Internationales Symposium über Deformationsmessungen mit geodätischen Methoden. Verlag Konrad Wittwer, Stuttgart
- Watson GS (1983) Statistics on spheres, vol 3. Wiley, New York
- Yang L, Knight N, Li Y, Rizos C (2013a) Optimal fault detection and exclusion applied in GNSS positioning. J Navig 66:683–700
- Yang L, Wang J, Knight N, Shen Y (2013b) Outlier separability analysis with a multiple alternative hypotheses test. J Geodesy 87:591–604

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.