DIA-datasnooping and identifiability
Abstract
In this contribution, we present and analyze datasnooping in the context of the DIA method. As the DIA method for the detection, identification and adaptation of mismodelling errors is concerned with both estimation and testing, it is their combination that needs to be considered. This combination is rigorously captured by the DIA estimator. We discuss and analyze the DIA-datasnooping decision probabilities and the construction of the corresponding partitioning of misclosure space. We also investigate the circumstances under which two or more hypotheses are nonseparable in the identification step. By means of a theorem on the equivalence between the nonseparability of hypotheses and the inestimability of parameters, we demonstrate that the parameter vector cannot be adapted for hypotheses that are nonseparable. However, as this concerns the complete vector and not necessarily functions of it, we also show that parameter functions may exist for which adaptation is still possible. It is shown what this adaptation looks like and how it changes the structure of the DIA estimator. To demonstrate the performance of the various elements of DIA-datasnooping, we apply the theory to some selected examples. We analyze how geometry changes in the measurement setup affect the testing procedure, by studying their partitioning of misclosure space, the decision probabilities and the minimal detectable and identifiable biases. The difference between these two minimal biases is highlighted by showing the difference between their corresponding contributing factors. We also show that if two alternative hypotheses, say \({\mathcal {H}}_{i}\) and \({\mathcal {H}}_{j}\), are nonseparable, the testing procedure may have different levels of sensitivity to \({\mathcal {H}}_{i}\)-biases compared to the same \({\mathcal {H}}_{j}\)-biases.
Keywords
Detection, identification and adaptation (DIA) · Datasnooping · Misclosure space partitioning · DIA estimator · Minimal detectable bias (MDB) · Minimal identifiable bias (MIB) · Probability of correct identification · Nonseparable hypotheses
1 Introduction
The DIA method for the detection, identification and adaptation of mismodelling errors combines estimation with testing. This combination of estimation and testing can be rigorously captured in the DIA estimator as introduced in (Teunissen 2017). The DIA method has already been widely employed in a variety of applications, such as the quality control of geodetic networks and the integrity monitoring of GNSS models, see, e.g., (DGCC 1982; Teunissen 1990; Salzmann 1995; Tiberius 1998; Perfetti 2006; Khodabandeh and Teunissen 2016; Zaminpardaz et al. 2015).
In this contribution, as an important example of multiple hypothesis testing, datasnooping (Baarda 1967, 1968; Teunissen 1985) is presented in the context of the DIA method. In doing so, we make use of the partitioning of misclosure space based on which we discuss the datasnooping decision probabilities and the construction of the corresponding DIA estimator. Through this partitioning, the distribution of the misclosure vector can be used to determine the correct detection (CD) and correct identification (CI) probabilities of each of the alternative hypotheses, as well as their corresponding minimal biases, the minimal detectable bias (MDB) and the minimal identifiable bias (MIB). We highlight their difference by showing the difference between their corresponding contributing factors. We also investigate the circumstances under which two or more hypotheses are nonseparable and discuss the relevant corrective actions including ‘remeasurement’, ‘adaptation’ or stating that the solution is ‘unavailable’. Of these, the adaptation step is the most involved and will be discussed in more detail.
This contribution is structured as follows. In Sect. 2, we briefly review the DIA method, describe the steps of DIA-datasnooping and define its corresponding DIA estimator. We hereby highlight the role played by the chosen partitioning of misclosure space. In Sect. 3, the decision probabilities of DIA-datasnooping are discussed, whereby the following events are distinguished: correct acceptance (CA), false alarm (FA), correct/missed detection and correct/wrong identification. It is hereby highlighted that the MDB provides information about correct detection and not about correct identification. A high probability of correct detection does not necessarily imply a high probability of correct identification, unless one is dealing with the special case of having only one single alternative hypothesis.
As identification of hypotheses becomes problematic if the misclosure vector has the same distribution under different hypotheses, we study its consequences for the identification and adaptation steps in Sect. 4. We discuss the corrective actions one can choose from in terms of ‘remeasurement’, ‘adaptation’ or stating that the solution is ‘unavailable’. Of these, the adaptation step is the most involved. By means of a theorem on the equivalence between the nonseparability of hypotheses and the inestimability of parameters, we demonstrate that adapting the complete vector of unknowns is not possible for hypotheses that are nonseparable. However, it is demonstrated that there may exist parameter functions for which adaptation is still possible. It is shown what this adaptation looks like and how it changes the structure of the DIA estimator.
To illustrate and explain the performance of the various elements of DIA-datasnooping, the theory is applied to selected examples in Sect. 5. The following three different cases are treated: height-difference observations of a leveling network, distance measurements of a horizontal geodetic network and pseudorange measurements between a single ground station and GPS satellites. We analyze how geometry changes in the measurement setup affect the testing procedure, including its partitioning of the misclosure space, and the corresponding CD probabilities (MDB) and CI probabilities (MIB). We also demonstrate that for a given bias-to-noise ratio and a false alarm probability, the ordering of the CD probabilities of the alternative hypotheses is not necessarily the same as that of their CI probabilities. It is also shown that if two alternative hypotheses, say \({\mathcal {H}}_{i}\) and \({\mathcal {H}}_{j}\), are not distinguishable, the testing procedure may have different levels of sensitivity to \({\mathcal {H}}_{i}\)-biases compared to the same \({\mathcal {H}}_{j}\)-biases. Finally, a summary and conclusions are given in Sect. 6.
2 Detection, identification and adaptation (DIA)
2.1 DIA in brief
1. Detection: The validity of the null hypothesis is checked by virtue of an overall model test, without the need of having to consider a particular set of alternative hypotheses. If \({\mathcal {H}}_{0}\) is accepted, \({\hat{x}}_{0}\) is provided as the estimate of x.
2. Identification: In case \({\mathcal {H}}_{0}\) is rejected, a search is carried out among the specified alternative hypotheses \({\mathcal {H}}_{i}\) (\(i=1,\ldots ,k\)) with the purpose of pinpointing the potential source of model error. In doing so, two decisions can be made. Either one of the alternative hypotheses, say \({\mathcal {H}}_{i}\), is confidently identified, or none can be identified as such, in which case an ‘undecided’ decision is made.
3. Adaptation: In case \({\mathcal {H}}_{i}\) is confidently identified, it is chosen as the new null hypothesis. The \({\mathcal {H}}_{0}\)-based inferences are then accordingly corrected and \({\hat{x}}_{i}\) is provided as the estimate of x. However, in case the ‘undecided’ decision is made, then the solution for x is declared ‘unavailable’.
2.2 DIA-datasnooping
1. Detection: Accept \({\mathcal {H}}_{0}\) if \(t\in {\mathcal {P}}_{0}\), with
$$\begin{aligned} {\mathcal {P}}_{0}=\left\{ t\in {\mathbb {R}}^{r}\;\big |\;\Vert t\Vert ^{2}_{Q_{tt}}\le k_{\alpha }\right\} \end{aligned}$$(6)
in which \(\Vert \cdot \Vert ^{2}_{Q_{tt}}=(\cdot )^{T}Q_{tt}^{-1}(\cdot )\) and \(k_{\alpha }\) is the user-chosen \(\alpha \)-percentage of the central Chi-square distribution with r degrees of freedom. If \({\mathcal {H}}_{0}\) is accepted, then \({\hat{x}}_{0}\) is provided as the estimate of x. Otherwise, go to step 2.
2. Identification: Form Baarda’s test statistic as (Baarda 1967; Teunissen 2000)
$$\begin{aligned} w_{i} \;=\; \dfrac{c^{T}_{t_{i}}Q_{tt}^{-1}t}{\sqrt{c^{T}_{t_{i}}Q_{tt}^{-1}c_{t_{i}}}},\quad i=1,\ldots ,k \end{aligned}$$(7)
in which \(c_{t_{i}}=B^{T}c_{i}\). Since \(c_{i}\) is a canonical unit vector, \(c_{t_{i}}\) is then the ith column of matrix \(B^T\). Select \({\mathcal {H}}_{i\ne 0}\) if \(t\in {\mathcal {P}}_{i\ne 0}\), with
$$\begin{aligned} {\mathcal {P}}_{i\ne 0}=\left\{ t\in {\mathbb {R}}^{r}/{\mathcal {P}}_{0}\;\big |\;|w_{i}|=\underset{j\in \{1,\ldots ,k\}}{\max }\;|w_{j}|\right\} \end{aligned}$$(8)
3. Adaptation: If \({\mathcal {H}}_{i}\) is selected, then \({\hat{x}}_{i}\) is provided as the estimate of x.
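The three steps above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dia_datasnooping`, the construction of B from a singular value decomposition of A, the adaptation by extending the design matrix with \(c_{i}\), and the small example design matrix are all hypothetical choices made here, and the critical value \(k_{\alpha }\) is simply passed in by the user.

```python
import numpy as np

def dia_datasnooping(A, Qyy, y, k_alpha):
    """Sketch of the three DIA-datasnooping steps.

    Returns (i, x_hat): i = 0 if H0 is accepted, otherwise the 1-based index
    of the selected hypothesis H_i (an outlier in observation i).
    """
    m, n = A.shape
    W = np.linalg.inv(Qyy)                      # weight matrix Qyy^{-1}

    # Assumption: B is an orthonormal basis of the null space of A^T, taken
    # from the last m-n left singular vectors of A (A has full column rank).
    U, _, _ = np.linalg.svd(A)
    B = U[:, n:]
    t = B.T @ y                                 # misclosure vector
    Qtt_inv = np.linalg.inv(B.T @ Qyy @ B)

    # Detection: overall model test, cf. (6).
    if t @ Qtt_inv @ t <= k_alpha:
        x0 = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
        return 0, x0

    # Identification: w-test statistics, cf. (7); column i of B^T is c_{t_i}.
    # Assumes no column of B^T is zero (otherwise w_i is undefined).
    Ct = B.T
    num = Ct.T @ Qtt_inv @ t
    den = np.sqrt(np.einsum("ji,jk,ki->i", Ct, Qtt_inv, Ct))
    w = num / den
    i = int(np.argmax(np.abs(w)))               # largest |w_i|, cf. (8)

    # Adaptation (assumed form): re-estimate x with design matrix [A c_i].
    Ai = np.column_stack([A, np.eye(m)[:, i]])
    xi = np.linalg.solve(Ai.T @ W @ Ai, Ai.T @ W @ y)
    return i + 1, xi[:n]

# Hypothetical example: m = 4 observations, n = 2 unknowns, redundancy r = 2.
A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true = np.array([1.0, 2.0])
y_clean = A @ x_true
y_biased = y_clean + 10.0 * np.eye(4)[:, 0]     # bias of 10 in observation 1
k_alpha = 5.9915                                # Chi-square(2) value for alpha = 0.05

decision_clean, x_clean = dia_datasnooping(A, np.eye(4), y_clean, k_alpha)
decision_biased, x_biased = dia_datasnooping(A, np.eye(4), y_biased, k_alpha)
```

The w-test statistics do not depend on the particular basis chosen for the null space of \(A^{T}\), which is why an orthonormal basis from the SVD is a convenient choice in this sketch.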
2.3 DIA estimator
Lemma 1
(Datasnooping partitioning) The \(m+1\) regions \({\mathcal {P}}_{i}\) of (6) and (8) form a partitioning of misclosure space iff \(c_{t_{i}}\nparallel c_{t_{j}}\) for any \(i\ne j\).
Proof
See Appendix.\(\square \)
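The condition of Lemma 1 can be checked numerically for a given matrix \(B^{T}\). The sketch below lists all pairs of parallel columns; the helper name `parallel_pairs`, the tolerance and the example matrix are assumptions made for illustration only.

```python
import numpy as np

def parallel_pairs(Ct, tol=1e-10):
    """Return 0-based index pairs (i, j), i < j, whose columns of Ct are parallel."""
    norms = np.linalg.norm(Ct, axis=0)
    pairs = []
    for i in range(Ct.shape[1]):
        for j in range(i + 1, Ct.shape[1]):
            if norms[i] == 0 or norms[j] == 0:
                continue  # a zero column has no direction; skipped in this sketch
            cos_ij = abs(Ct[:, i] @ Ct[:, j]) / (norms[i] * norms[j])
            if cos_ij >= 1.0 - tol:
                pairs.append((i, j))
    return pairs

# Hypothetical B^T for a model with m = 4 observations and redundancy r = 2.
Ct = np.array([[1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, 1.0, -1.0]])
pairs = parallel_pairs(Ct)   # columns 3 and 4 (0-based indices 2 and 3) are parallel
```

An empty list means that the regions of (6) and (8) form a partitioning; any listed pair points to two hypotheses whose regions coincide.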
It will be clear that the conditions of the above Lemma may not always be fulfilled. The question is then which strategy to follow to deal with such a situation. Should one decide for ‘undecidedness’ if \(c_{t_{i}} \parallel c_{t_{j}}\) for some \(i \ne j\), or should one remeasure all such involved observables, or would it still be possible to perform an adaptation? An answer to these questions is provided in Sect. 4, where we consider the more general case and do not restrict \(C_{i}\) to be the canonical unit vector \(c_{i}\). First, however, we discuss the testing probabilities that are involved in the detection and identification steps.
3 Detection versus identification
3.1 The probabilities
3.2 Minimal detectable bias (MDB)
3.3 Minimal identifiable bias (MIB)

\(\overline{{\mathcal {P}}}_{i}\): As the integrand function in (19) is positive for all \({\tau }\in {\mathbb {R}}^{r}\), then the integral value will increase as \(\overline{{\mathcal {P}}}_{i}\) expands.

The orientation of \({\bar{c}}_{i}\) w.r.t. the borders of \(\overline{{\mathcal {P}}}_{i}\): The unit vector \({\bar{c}}_{i}\), lying within the borders of \(\overline{{\mathcal {P}}}_{i}\), determines the direction of \(E({\bar{t}}\mid {\mathcal {H}}_{i})\) about which the PDF \(f_{{\bar{t}}}({\tau }\mid {\mathcal {H}}_{i})\) is symmetric. The following lemma elaborates the role of the orientation of \({\bar{c}}_{i}\) in CI probability for \(r=2\). For this case, the regions \(\overline{{\mathcal {P}}}_{i}\) in (11) are defined in \({\mathbb {R}}^{2}\). Each region then has three borders, of which one is curved (with \(\overline{{\mathcal {P}}}_{0}\)) and two are straight lines on either side of \({\bar{c}}_{i}\).
Lemma 2
Proof
See the Appendix. \(\square \)

\(\Vert c_{t_{i}}\Vert _{Q_{tt}}\): The scalar \(\Vert c_{t_{i}}\Vert _{Q_{tt}}\) determines the magnitude of \(E({\bar{t}}\mid {\mathcal {H}}_{i})\). Therefore, the larger the value of \(\Vert c_{t_{i}}\Vert _{Q_{tt}}\), the further the center of \(f_{{\bar{t}}}({\tau }\mid {\mathcal {H}}_{i})\) gets from the origin along \({\bar{c}}_{i}\), and the larger the probability mass of \(f_{{\bar{t}}}({\tau }\mid {\mathcal {H}}_{i})\) inside \(\overline{{\mathcal {P}}}_{i}\) will become.
4 Nonseparable hypotheses
4.1 Identifying nonseparable hypotheses
1. Remeasurement: If in case of datasnooping, \({\mathcal {H}}_{i}\) and \({\mathcal {H}}_{j}\) are singled out in the identification step, then it is one of the two observables, \(y_{i}=c_{i}^{T}y\) or \(y_{j}=c_{j}^{T}y\), that is suspected to contain a blunder or outlier. To remedy the situation, one may then decide to replace both \(y_{i}\) and \(y_{j}\) by their remeasured values.
2. Adaptation: If remeasurement is not an option, one might think that adaptation of \({\hat{x}}_{0}\) would be an option by extending the design matrix to \([A~C_{i}~C_{j}]\), so as to cover both the hypotheses \({\mathcal {H}}_{i}\) and \({\mathcal {H}}_{j}\). But, as the theorem below shows, this is unfortunately not possible as x will then become inestimable. Also note, despite the nonseparability of the two hypotheses, that adaptation on either \([A~C_{i}]\) or \([A~C_{j}]\) should not be pursued. Such adaptation will still produce a biased result if done for the wrong hypothesis.
3. Unavailability: Without remeasurement or adaptation, the remaining option is to declare a solution for x to be unavailable.
Theorem 1
Proof
See the Appendix. \(\square \)
The above theorem conveys that if the alternative hypotheses \({\mathcal {H}}_{i}\) with \(i=1,\ldots ,l\) are not distinguishable, then extending the design matrix A by any two or more matrices \(C_{i}\) with \(i=1,\ldots ,l\) will result in a rank-deficient design matrix and therefore make unbiased estimability of the parameter vector x impossible. The conclusion reads therefore that if remeasurement is not an option and x is the parameter vector for which a solution is sought, the issue of nonseparable hypotheses should already be tackled at the designing phase of the measurement experiment.
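The rank deficiency described here can be illustrated with a small numerical check. The design matrix below is a hypothetical example, chosen so that the third and fourth columns of \(B^{T}\) are parallel, i.e. \({\mathcal {H}}_{3}\) and \({\mathcal {H}}_{4}\) are nonseparable; the helper name `has_full_column_rank` is likewise an assumption of this sketch.

```python
import numpy as np

# Hypothetical design matrix (m = 4, n = 2). A basis of the null space of A^T
# is B = [[1, 1], [-1, 0], [0, 1], [0, -1]], so the 3rd and 4th columns of B^T,
# (0, 1) and (0, -1), are parallel.
A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
I4 = np.eye(4)
c3, c4 = I4[:, 2], I4[:, 3]

def has_full_column_rank(M):
    """True if all columns of M are linearly independent."""
    return np.linalg.matrix_rank(M) == M.shape[1]

rank_ok_c3 = has_full_column_rank(np.column_stack([A, c3]))        # [A c3]
rank_ok_c4 = has_full_column_rank(np.column_stack([A, c4]))        # [A c4]
rank_ok_both = has_full_column_rank(np.column_stack([A, c3, c4]))  # [A c3 c4]
```

In this example each single extension keeps full column rank, whereas the joint extension does not, because the second column of A equals the sum of \(c_{3}\) and \(c_{4}\).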
4.2 Adaptation for estimable functions
The above theorem has shown that one can forget about adapting \({\hat{x}}_{0}\) for hypotheses that are nonseparable. This concerns, however, the complete vector x and not necessarily functions of x. It could still be possible that some relevant components of x or some relevant functions of x remain estimable, despite the rank deficiency of the extended design matrix. The following theorem specifies which parameters remain estimable after the mentioned extension of the design matrix as well as presents the corresponding adaptation step for these estimable parameters.
Theorem 2
Proof
See the Appendix. \(\square \)
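Whether a particular linear function of the parameters remains estimable under a rank-deficient extended design matrix can be checked numerically. The sketch below does not reproduce Theorem 2; it uses the standard row-space criterion for estimability in linear models, and the function name `is_estimable` and the example matrices are assumptions made for illustration.

```python
import numpy as np

def is_estimable(f, M):
    """True if f^T z is estimable under E(y) = M z, i.e. f lies in the row space of M."""
    rank_M = np.linalg.matrix_rank(M)
    rank_aug = np.linalg.matrix_rank(np.vstack([M, f]))
    return bool(rank_aug == rank_M)

# Hypothetical example: A extended with c3 and c4, where H3 and H4 are nonseparable.
A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
I4 = np.eye(4)
M = np.column_stack([A, I4[:, 2], I4[:, 3]])    # [A c3 c4], rank 3 with 4 columns

x1_estimable = is_estimable(np.array([1.0, 0.0, 0.0, 0.0]), M)  # first component of x
x2_estimable = is_estimable(np.array([0.0, 1.0, 0.0, 0.0]), M)  # second component of x
```

In this example the first component of x remains estimable while the second does not, since the latter cannot be separated from the two bias parameters.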
5 Numerical analysis
In this section, we apply the theory of the previous sections to some selected examples so as to illustrate and explain the performance of the various decision elements in DIA-datasnooping. The insight so obtained will also help us appreciate some of the more complex intricacies of the theory. The following three different cases are considered: height-difference observations of a leveling network, distance measurements of a horizontal geodetic network and pseudorange measurements between a single ground station and GPS satellites. We analyze and illustrate how geometry changes in the measurement setup affect the testing procedure, including its partitioning of the misclosure space, and the corresponding CD probabilities (MDB) and CI probabilities (MIB). The CD probability under \({\mathcal {H}}_{i}\) (\(i=1,\ldots ,k\)) is computed based on (16) from \(\chi ^{2}(r,\lambda _{i}^{2})\), whereas the CI probability under \({\mathcal {H}}_{i}\) (\(i=1,\ldots ,k\)) is computed as described in the Appendix.
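As a rough illustration of how such probabilities can be approximated, the sketch below estimates the CD and CI probabilities by plain Monte Carlo simulation of the misclosure vector. This is an assumed stand-in and not necessarily the procedure of the Appendix; the function name `cd_ci_probabilities`, the sample size, the seed and the example matrices are hypothetical.

```python
import numpy as np

def cd_ci_probabilities(Ct, Qtt, i, b, k_alpha, n_samples=20000, seed=0):
    """Monte Carlo estimates of (P_CD, P_CI) for a bias b under hypothesis i (0-based)."""
    rng = np.random.default_rng(seed)
    Qtt_inv = np.linalg.inv(Qtt)
    # Under H_i the misclosure vector is distributed as N(b * c_{t_i}, Qtt).
    t = rng.multivariate_normal(b * Ct[:, i], Qtt, size=n_samples)
    # Detection: squared Qtt-weighted norm of each sample against k_alpha, cf. (6).
    T = np.einsum("sj,jk,sk->s", t, Qtt_inv, t)
    detected = T > k_alpha
    # Identification: w-test statistics for every sample and hypothesis, cf. (7)-(8).
    den = np.sqrt(np.einsum("ji,jk,ki->i", Ct, Qtt_inv, Ct))
    w = (t @ Qtt_inv @ Ct) / den
    identified = np.argmax(np.abs(w), axis=1)
    p_cd = detected.mean()
    p_ci = (detected & (identified == i)).mean()
    return float(p_cd), float(p_ci)

# Hypothetical example with r = 2 and unit-variance, uncorrelated observations.
Ct = np.array([[1.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, 1.0, -1.0]])
Qtt = Ct @ Ct.T
k_alpha = 5.9915                      # Chi-square(2) value for alpha = 0.05
p_cd, p_ci = cd_ci_probabilities(Ct, Qtt, i=0, b=10.0, k_alpha=k_alpha)
p_fa, _ = cd_ci_probabilities(Ct, Qtt, i=0, b=0.0, k_alpha=k_alpha)  # no bias: false alarm
```

Because correct identification requires detection first, the estimated CI probability can never exceed the estimated CD probability; with zero bias the detection rate reduces to the false alarm probability.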
5.1 Leveling network
5.1.1 Misclosure space partitioning
5.1.2 CD and CI probabilities
The dashed curves in Fig. 2 (second row) depict \(\mathrm{P}_{\mathrm{CI}_{i}}\) as a function of \(b/\sigma \). These curves (\(\mathrm{P}_{\mathrm{CI}_{i}}\)) always lie below their solid counterparts (\(\mathrm{P}_{\mathrm{CD}_{i}}\)). Like the solid graphs, these dashed graphs can be used either for comparing the MIB of different alternative hypotheses given a certain \(\mathrm{P}_{\mathrm{CI}_{i}}=\gamma _{\mathrm{CI}}\), or for comparing the corresponding \(\mathrm{P}_{\mathrm{CI}_{i}}\) of different alternative hypotheses given a certain \(b_{i}=b\). We note that despite the CD probability of \({\mathcal {H}}_{\mathsf {A}}\) being always larger than that of \({\mathcal {H}}_{\mathsf {B}}\) and \({\mathcal {H}}_{\mathsf {C}}\), the CI probability of \({\mathcal {H}}_{\mathsf {A}}\) is not always larger than that of \({\mathcal {H}}_{\mathsf {B}}\) and \({\mathcal {H}}_{\mathsf {C}}\). Depending on the number of measurements in each loop n, if \(b/\sigma \) is smaller than a certain value, then we have \(\mathrm{P}_{\mathrm{CI}_{\mathsf {A}}}<\mathrm{P}_{\mathrm{CI}_{\mathsf {B}}}=\mathrm{P}_{\mathrm{CI}_{\mathsf {C}}}\). This discrepancy between the behavior of the CD probability and that of the CI probability as a function of \(b/\sigma \) for a given \(\alpha \) is due to the fact that while \(\mathrm{P}_{\mathrm{CD}_{i}}\) is driven only by \(\Vert c_{t_{i}}\Vert _{Q_{tt}}\), \(\mathrm{P}_{\mathrm{CI}_{i}}\) is in addition driven by \(\overline{{\mathcal {P}}}_{i}\) and the orientation of \({\bar{c}}_{i}\) w.r.t. the straight borders of \(\overline{{\mathcal {P}}}_{i}\) (cf. 19). Looking at the first row of Fig. 2, we note that \(\overline{{\mathcal {P}}}_{\mathsf {A}}\) has a smaller area compared to \(\overline{{\mathcal {P}}}_{\mathsf {B}}\) and \(\overline{{\mathcal {P}}}_{\mathsf {C}}\).
Therefore, b should be large enough such that \(\Vert c_{t_{\mathsf {A}}}\Vert _{Q_{tt}}>\Vert c_{t_{\mathsf {B}}}\Vert _{Q_{tt}}=\Vert c_{t_{\mathsf {C}}}\Vert _{Q_{tt}}\) can compensate for \(\overline{{\mathcal {P}}}_{\mathsf {A}}\) being smaller than \(\overline{{\mathcal {P}}}_{\mathsf {B}}\) and \(\overline{{\mathcal {P}}}_{\mathsf {C}}\).
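The statement that \(\mathrm{P}_{\mathrm{CD}_{i}}\) is driven only by \(\Vert c_{t_{i}}\Vert _{Q_{tt}}\) can be made explicit as follows. This is a sketch in the standard form of testing theory (cf. Teunissen 2000), assuming a scalar bias \(b_{i}\) under \({\mathcal {H}}_{i}\) so that \(E(t\mid {\mathcal {H}}_{i})=c_{t_{i}}b_{i}\); the symbol \(\lambda (\alpha ,\gamma _{\mathrm{CD}},r)\) for the noncentrality value that yields a CD probability \(\gamma _{\mathrm{CD}}\) is notation assumed here.

$$\begin{aligned} \mathrm{P}_{\mathrm{CD}_{i}}=\mathrm{P}\left( \Vert t\Vert ^{2}_{Q_{tt}}>k_{\alpha }\mid {\mathcal {H}}_{i}\right) ,\quad \Vert t\Vert ^{2}_{Q_{tt}}\mid {\mathcal {H}}_{i}\sim \chi ^{2}(r,\lambda _{i}^{2}),\quad \lambda _{i}^{2}=b_{i}^{2}\,\Vert c_{t_{i}}\Vert ^{2}_{Q_{tt}} \end{aligned}$$

$$\begin{aligned} \mathrm{P}_{\mathrm{CD}_{i}}=\gamma _{\mathrm{CD}}\quad \Longrightarrow \quad |b_{i}|=\dfrac{\lambda (\alpha ,\gamma _{\mathrm{CD}},r)}{\Vert c_{t_{i}}\Vert _{Q_{tt}}} \end{aligned}$$

For fixed \(\alpha \), r and bias size, a larger \(\Vert c_{t_{i}}\Vert _{Q_{tt}}\) thus gives a larger noncentrality parameter and hence a larger CD probability, or equivalently a smaller MDB. No such closed form is available for the CI probability, which also depends on the shape of \(\overline{{\mathcal {P}}}_{i}\).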
5.1.3 Impact of partitioning on CI probability
5.2 Horizontal geodetic network
5.2.1 Misclosure space partitioning
5.2.2 CD and CI probabilities
In (b) and (c), \({\mathcal {H}}_{\mathsf {B}}\) is not distinguishable from \({\mathcal {H}}_{\mathsf {C}}\). For these hypotheses, although not identifiable from each other, we still define CI probability as \(\mathrm{P}_{\mathrm{CI}_{\mathsf {B}}}=\mathrm{P}({\bar{t}}\in \overline{{\mathcal {P}}}_{\mathsf {B}}\mid {\mathcal {H}}_{\mathsf {B}})\) and \(\mathrm{P}_{\mathrm{CI}_{\mathsf {C}}}=\mathrm{P}({\bar{t}}\in \overline{{\mathcal {P}}}_{\mathsf {B}}\mid {\mathcal {H}}_{\mathsf {C}})\). It can be seen that, although \({\mathcal {H}}_{\mathsf {B}}\) is not distinguishable from \({\mathcal {H}}_{\mathsf {C}}\), they are different in both the CD and CI probabilities. Also, the testing procedure is more sensitive to the biases in \(y_{\mathsf {B}}\) compared to the same biases in \(y_{\mathsf {C}}\). This is due to the fact that the observation of \(\mathsf {C}\) contributes to the misclosure vector less than the observation of \(\mathsf {B}\). The contribution of the measurement of \(\mathsf {C}\) to the misclosure vector depends on the relative orientation of \(u_{\mathsf {B}}\) w.r.t. \(u_{\mathsf {C}}\). In case \(u_{\mathsf {B}}\) is parallel to \(u_{\mathsf {A}}\) and \(u_{\mathsf {D}}\), the measurement of the point \(\mathsf {C}\) would have zero contribution to the misclosure vector and cannot be screened at all. As the angle between \(u_{\mathsf {B}}\) and \(u_{\mathsf {C}}\) decreases, the mentioned contribution increases, and so does the sensitivity of the testing procedure to the biases in the measurement of \(\mathsf {C}\).
 If \(\cos \angle (u_{i},u_{i+1})=\cos 45^{\circ }\) for any \(i=1,2,3\), then

\(\cos \angle ({\bar{c}}_{i},{\bar{c}}_{i+1})=\cos 45^{\circ }\)

\(\overline{{\mathcal {P}}}_{i}\) has the same shape as \(\overline{{\mathcal {P}}}_{j}\) for any \(i\ne j\)

\(\mathrm{P}_{\mathrm{CD}_{i}}=\mathrm{P}_{\mathrm{CD}_{j}}\) and \(\mathrm{P}_{\mathrm{CI}_{i}}=\mathrm{P}_{\mathrm{CI}_{j}}\) for any \(i\ne j\)

 If \(u_{\mathsf {A}}\parallel u_{\mathsf {D}}\), then

\(\overline{{\mathcal {P}}}_{\mathsf {B}}=\overline{{\mathcal {P}}}_{\mathsf {C}}\)

\(\mathrm{P}_{\mathrm{CD}_{\mathsf {A}}}=\mathrm{P}_{\mathrm{CD}_{\mathsf {D}}}\) and \(\mathrm{P}_{\mathrm{CI}_{\mathsf {A}}}=\mathrm{P}_{\mathrm{CI}_{\mathsf {D}}}\)

 If \(u_{\mathsf {A}}\parallel u_{\mathsf {D}}\) and \(u_{\mathsf {B}}\parallel u_{\mathsf {C}}\), then

\(\overline{{\mathcal {P}}}_{\mathsf {A}}=\overline{{\mathcal {P}}}_{\mathsf {D}}\) and \(\overline{{\mathcal {P}}}_{\mathsf {B}}=\overline{{\mathcal {P}}}_{\mathsf {C}}\)

\(\overline{{\mathcal {P}}}_{\mathsf {A}}\) has the same shape as \(\overline{{\mathcal {P}}}_{\mathsf {B}}\)

\(\mathrm{P}_{\mathrm{CD}_{i}}=\mathrm{P}_{\mathrm{CD}_{j}}\) and \(\mathrm{P}_{\mathrm{CI}_{i}}=\mathrm{P}_{\mathrm{CI}_{j}}\) for any \(i\ne j\)

 If \(u_{\mathsf {A}}\parallel u_{\mathsf {D}}\) and \(u_{\mathsf {C}}\perp u_{\mathsf {A}}\), then

\(\mathrm{P}_{\mathrm{CD}_{\mathsf {A}}}=\mathrm{P}_{\mathrm{CD}_{\mathsf {B}}}=\mathrm{P}_{\mathrm{CD}_{\mathsf {D}}}\)

\(\mathrm{P}_{\mathrm{CD}_{\mathsf {B}}}\ge \mathrm{P}_{\mathrm{CD}_{\mathsf {C}}}\) and \(\mathrm{P}_{\mathrm{CI}_{\mathsf {B}}}\ge \mathrm{P}_{\mathrm{CI}_{\mathsf {C}}}\).

If \(\angle (u_{\mathsf {B}},u_{\mathsf {C}})\) decreases, so do the differences \(\mathrm{P}_{\mathrm{CD}_{\mathsf {B}}}-\mathrm{P}_{\mathrm{CD}_{\mathsf {C}}}\) and \(\mathrm{P}_{\mathrm{CI}_{\mathsf {B}}}-\mathrm{P}_{\mathrm{CI}_{\mathsf {C}}}\).


If \(u_{\mathsf {A}}\parallel u_{\mathsf {B}}\), \(u_{\mathsf {A}}\parallel u_{\mathsf {D}}\) and \(u_{\mathsf {C}}\perp u_{\mathsf {A}}\), then \(\mathrm{P}_{\mathrm{CD}_{\mathsf {C}}}=\mathrm{P}_{\mathrm{CI}_{\mathsf {C}}}=0\).
5.3 GPS singlepoint positioning
5.3.1 Misclosure space partitioning
5.3.2 CD and CI probabilities

\(\lambda _{3,i}\): For \(i=4\), since \(u_{j\ne 4}^{T}\,d\approx c\), it can be concluded that \(v_{3,4}\) is almost parallel to d and \(\lambda _{3,4}\approx 0\). This implies that \(\lambda _{3,4}^{-1}\) is extremely large. For \(i\ne 4\), among the five remaining satellites, there are still four unit vectors which satisfy \(u_{j\ne i,4}^{T}\,d\approx c\). Therefore, the eigenvector \(v_{3,i\ne 4}\) does not deviate too much from the direction d. However, due to the presence of satellite G4 not lying on the mentioned cone, \(\lambda _{3,i\ne 4}\) is much larger than zero, implying that \(\lambda _{3,i\ne 4}^{-1}\) is much smaller than \(\lambda _{3,4}^{-1}\).

The angle between \((u_{i}-{\bar{u}}_{\ne i})\) and \(v_{3,i}\): As shown in the skyplot in (b), while \(u_{4}\) is almost parallel to \(v_{3,4}\), \(u_{i\ne 4}\) makes an angle of almost \(56^{\circ }\) with \(v_{3,i\ne 4}\) (almost parallel to d). For the geometry shown in (b), \({\bar{u}}_{\ne 4}\) is almost parallel to \(v_{3,4}\), whereas this is not the case with \({\bar{u}}_{\ne i}\) (\(i\ne 4\)). Therefore, we have \((u_{4}-{\bar{u}}_{\ne 4})\parallel v_{3,4}\).

\(\Vert u_{i}-{\bar{u}}_{\ne i}\Vert \): We can write \(\Vert u_{i}-{\bar{u}}_{\ne i}\Vert ^{2}=1+\Vert {\bar{u}}_{\ne i}\Vert ^{2}-2u_{i}^{T}{\bar{u}}_{\ne i}\). Since \({\bar{u}}_{\ne i}\) is computed based on five out of six unit direction vectors, its norm does not change too much for different i. Therefore, \(\Vert u_{i}-{\bar{u}}_{\ne i}\Vert \) gets its minimum value for \(i=4\) as \(u_{4}\) is almost parallel to \({\bar{u}}_{\ne 4}\). However, \(\Vert u_{4}-{\bar{u}}_{\ne 4}\Vert <\Vert u_{i}-{\bar{u}}_{\ne i}\Vert \) is overcompensated by \(\lambda _{3,4}^{-1}>\lambda _{3,i}^{-1}\).
6 Conclusion and summary
In this contribution, we presented datasnooping in the context of the DIA method, discussed its decision probabilities for detection and identification and showed what options one has available when two or more of the alternative hypotheses are nonseparable.
In our discussion, we emphasized the central role that is played by the partitioning of misclosure space, both in the formation of the decision probabilities and in the construction of the DIA estimator. In case of datasnooping, the partitioning is determined by the row vectors of the basis matrix of the null space of \(A^{T}\). Through this partitioning, the distribution of the misclosure vector can be used to determine the correct detection (CD) and correct identification (CI) probabilities of each of the alternative hypotheses. These probabilities can be ‘inverted’ to determine their corresponding minimal biases, the minimal detectable bias (MDB) and the minimal identifiable bias (MIB). We highlighted their difference by showing the difference between their corresponding contributing factors. In particular, it should be realized that the MDB provides information about correct detection and not about correct identification. A high probability of correct detection does not necessarily imply a high probability of correct identification, unless one is dealing with the special case of having only one single alternative hypothesis.
In the identification step, one has to ascertain whether or not all the hypotheses are identifiable. Identification of hypotheses becomes problematic if the misclosure vector has the same distribution under different hypotheses. We discussed the options one can choose from in terms of ‘remeasurement’, ‘adaptation’ or stating that the solution is ‘unavailable’. Of these, the adaptation step is the most involved. By means of an equivalence between the nonseparability of hypotheses and the inestimability of parameters (cf. Theorem 1), we demonstrated that adapting \({\hat{x}}_{0}\) is not possible for hypotheses that are nonseparable. However, as this concerns the complete vector x and not necessarily functions of x, we also demonstrated that functions of x may exist for which adaptation is still possible (cf. Theorem 2). It was shown what this adaptation looks like and how it changes the structure of the DIA estimator.
We applied the theory to selected examples so as to illustrate and explain the performance of the various elements of DIA-datasnooping. Three different cases were discussed in detail: height-difference observations of a leveling network, distance measurements of a horizontal geodetic network and pseudorange measurements between a single ground station and GPS satellites. We analyzed and illustrated how geometry changes in the measurement setup affect the testing procedure, including its partitioning of the misclosure space, and the corresponding CD probabilities (MDB) and CI probabilities (MIB). We also demonstrated that for a given bias-to-noise ratio and a false alarm probability, the ordering of the CD probabilities of the alternative hypotheses is not necessarily the same as that of their CI probabilities. We also showed that if two alternative hypotheses, say \({\mathcal {H}}_{i}\) and \({\mathcal {H}}_{j}\), are not distinguishable, the testing procedure may have different levels of sensitivity to \({\mathcal {H}}_{i}\)-biases compared to the same \({\mathcal {H}}_{j}\)-biases.
Acknowledgements
The second author is the recipient of an Australian Research Council (ARC) Federation Fellowship (Project Number FF0883188). This support is gratefully acknowledged.
References
Baarda W (1967) Statistical concepts in geodesy. Netherlands Geodetic Commission, Publ. on Geodesy, New Series 2(4)
Baarda W (1968) A testing procedure for use in geodetic networks. Netherlands Geodetic Commission, Publ. on Geodesy, New Series 2(5)
DGCC (1982) The Delft Approach for the design and computation of geodetic networks. In: “Forty years of thought” anniversary edition on the occasion of the 65th birthday of Prof W Baarda, by staff of the Delft Geodetic Computing Centre (DGCC), vol 2, pp 202–274
Hawkins DM (1980) Identification of outliers, vol 11. Chapman and Hall, London
Imparato D, Teunissen PJG, Tiberius CCJM (2018) Minimal detectable and identifiable biases for quality control. Surv Rev. https://doi.org/10.1080/00396265.2018.1437947
Khodabandeh A, Teunissen PJG (2016) Single-epoch GNSS array integrity: an analytical study. In: Sneeuw N, Novák P, Crespi M, Sansò F (eds) VIII Hotine-Marussi symposium on mathematical geodesy: proceedings of the symposium in Rome, 17–21 June, 2013, Springer International Publishing, pp 263–272
Koch KR (1999) Parameter estimation and hypothesis testing in linear models. Springer, Berlin Heidelberg
Kok J (1984) Statistical analysis of deformation problems using Baarda’s testing procedure. In: “Forty years of thought” anniversary volume on the occasion of Prof Baarda’s 65th birthday, Delft, vol 2, pp 470–488
Parkinson BW, Axelrad P (1988) Autonomous GPS integrity monitoring using the pseudorange residual. Navigation 35(2):255–274
Perfetti N (2006) Detection of station coordinate discontinuities within the Italian GPS fiducial network. J Geodesy 80(7):381–396
Robert C, Casella G (2013) Monte Carlo statistical methods. Springer, Berlin
Salzmann M (1995) Real-time adaptation for model errors in dynamic systems. Bull Geodesique 69(2):81–91
Sturza M (1988) Navigation system integrity monitoring using redundant measurements. Navigation 35(4):483–501
Su X, Zhan X, Niu M, Zhang Y (2014) Receiver autonomous integrity monitoring availability and fault detection capability comparison between BeiDou and GPS. J Shanghai Jiaotong Univ (Sci) 19:313–324
Teunissen PJG (1985) Generalized inverses, adjustment, the datum problem, and S-transformation. In: Grafarend E, Sanso F (eds) Optim Des Geod Netw. Springer, Berlin
Teunissen PJG (1990) An integrity and quality control procedure for use in multisensor integration. In: Proceedings of ION GPS-1990, ION, pp 513–522
Teunissen PJG (2000) Testing theory: an introduction. Series on mathematical geodesy and positioning. Delft University Press, Delft
Teunissen PJG (2017) Distributional theory for the DIA method. J Geodesy 92:59–80. https://doi.org/10.1007/s00190-017-1045-7
Teunissen PJG, Amiri-Simkooei AR (2008) Least-squares variance component estimation. J Geodesy 82(2):65–82
Teunissen PJG, Khodabandeh A (2013) BLUE, BLUP and the Kalman filter: some new results. J Geodesy 87(5):461–473
Teunissen PJG, Imparato D, Tiberius CCJM (2017) Does RAIM with correct exclusion produce unbiased positions? Sensors 17(7):1508
Tiberius C (1998) Recursive data processing for kinematic GPS surveying. Netherlands Geodetic Commission, Publ. on Geodesy, New Series (45)
Van der Marel H, Kosters A (1990) Statistical testing and quality analysis in 3d networks: (part ii) application to GPS. Global positioning system: an overview. In: International Association of Geodesy Symposia, Springer, vol 102, pp 290–297
Van Mierlo J (1980) A testing procedure for analytic deformation measurements. In: Proceedings of internationales symposium ueber Deformationsmessungen mit Geodaetischen Methoden, Verlag Konrad Wittwer, Stuttgart
Zaminpardaz S, Teunissen PJG, Nadarajah N, Khodabandeh A (2015) GNSS array-based ionospheric spatial gradient monitoring: precision and integrity analysis. In: Proceedings of the ION pacific PNT meeting, ION, USA, pp 799–814
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.