1 Introduction

When analyzing deformation of a geodetic network, it is necessary to investigate the movements of its points, which always depends on correct identification of displaced and stable points. Different methods and approaches can be applied in such analyses. The choice of method usually depends on the kind of deformation investigated and on the type of control network designed for the particular object of study. Many approaches are based on least squares estimation (LSE), for example, Conventional Deformation Analysis (CDA) (see, e.g., Pelzer 1971; Niemeier 1985; Heck 1984; Koch 1985; Chen 1983; Hekimoglu et al. 2010). A simple comparison between the results of LSE at two different measurement epochs is often not sufficient. This may result from unrecognized non-random disturbances (such as blunders or outliers) that might affect the observation results (Shaorong 1990; Duchnowski 2011) and be a source of various unwanted effects, for example, smearing or masking (Schwarz and Kok 1993; Prószyński 2000; Hekimoglu et al. 2010). Thus, the problem of outliers also applies to deformation analyses. In many cases, the outlying observations can be detected and rejected from the set of observations by applying procedures generally called “data cleaning” (see, Baarda 1968; Pope 1976; Schaffrin and Wang 1994; Ding and Coleman 1996; Xu 2005; Hekimoglu and Erenoglu 2007; Gui et al. 2011). However, correct detection is not always possible (see, for example, Xu 2005; Gui et al. 2011) and the outliers might affect the deformation analysis even if such procedures are applied. Thus, CDA is still being developed and improved to cope with such more complicated cases or to increase its efficiency, for example, by an alternative strategy for absolute deformation monitoring networks based on division into subnetworks that contain one objective point and the reference points (see, for example, Hekimoglu et al. 2010). On the other hand, there are also other, less traditional methods that can be applied in deformation analyses, for example \(R\)-estimation (see, Duchnowski 2009, 2010) or \(\hbox {M}_\mathrm{split}\) estimation (see, Wiśniewski 2009). A special variant of the latter, which concerns the estimation of a shift of parameters, was proposed in Duchnowski and Wiśniewski (2012). However, a natural method for estimating the shift between two sets of observations is \(R\)-estimation.

\(R\)-estimation is one of the basic approaches to robust estimation; the other ones are \(M\)-estimation and \(L\)-estimation (see, e.g., Huber 1981; Hampel et al. 1986). There is no doubt that \(M\)-estimation is the most common robust method applied to solve problems in geodesy or surveying. Since the pioneering paper (Huber 1964), \(M\)-estimation has been improved and developed (see, e.g., Andrews 1974; Huber 1981; Hampel et al. 1986) and it has also become an important method of robust estimation employed in geodetic applications (see, Xu 1989, 2005; Yang et al. 2002; Wiśniewski 2009, 2010; Erenoglu and Hekimoglu 2010; Guo et al. 2011). On the other hand, \(R\)-estimation, which is a robust method often applied in other fields of science, is still an almost unknown (or rarely applied) method in geodesy and surveying. The beginnings of the estimates derived from rank tests, namely \(R\)-estimates, are related to the paper of Hodges and Lehmann (1963), hence the alternative name for estimates of this type, namely the Hodges–Lehmann estimates (HLE). Since then, the method has been further developed (see, for example, Adichie 1967; Jurečková 1969; Kraft and van Eeden 1972; Antille 1974; Dionne 1981; Koul and Saleh 1993; Chen 2001) and its properties have been analyzed (see, e.g., Bickel 1965; Hoyland 1965; van Eeden 1970; Rousseeuw and Verboven 2002; Small et al. 2006) by many authors and for many particular reasons. The method in question has also been proposed for solving some problems in geodesy or surveying (see, for example, Kargoll 2005; Duchnowski 2008). The second paper concerns the application of \(R\)-estimates in the LS adjustment to cope with non-random errors (systematic or gross errors) that might affect the observations (the R-LS method). The direct application of HLE as proposed in Hodges and Lehmann (1963) is possible in geodetic or surveying problems only in basic cases, namely when the functional model of geodetic observations is very simple (see, Duchnowski 2008, 2009). The subsequent applications required adaptations, among other things, the introduction of initial residuals (Dionne 1981; Duchnowski 2009, 2010). Then, \(R\)-estimates were applied in the new strategy for testing the stability of possible reference points that was proposed in Duchnowski (2010). This application results from the high robustness of \(R\)-estimates against outliers (see, Rousseeuw and Verboven 2002; Duchnowski 2011) and in some cases can give better results than conventional methods, for example, when two of four possible reference points are unstable (Duchnowski 2010).

The applications of \(R\)-estimates in deformation analysis that have been proposed so far have proved their worth; however, there are still some disadvantages or limitations. Above all, this concerns the general theoretical assumptions formulated for estimators of this type, namely the assumption of the identical distribution of the respective random variables. This limits the application of the method to cases when all observation results have the same accuracy (or at least are standardized). Another disadvantage is the application of initial residuals, which are useful only in 1D networks. Hence, the main objective of the present paper is to overcome these limitations by proposing new versions of HLE which take into account differences in observation accuracy and which are not based on the application of initial residuals.

2 Hodges–Lehmann estimates and their new variants

Let us now focus on the two-sample problem presented in Hodges and Lehmann (1963), which is the most suitable one for deformation analyses (Duchnowski 2009, 2010). Thus, let us consider two independent samples \(x_{1}, x_{2}, x_{3},{\ldots }, x_{m}\) and \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n }\) which are sets of realizations of the random variables \(X_{1}, X_{2},X_{3}, {\ldots }, X_{m}\) and \( Y_{1}, Y_{2}, Y_{3},{\ldots }, Y_{n}\), respectively, and let the distributions of these variables differ from each other only in a shift \(\Delta \), namely let the distribution functions be \( F(x)\) and \(G(y)=F(y - \Delta )\), respectively (e.g., Huber 1981). Hodges and Lehmann (1963) proposed to estimate the shift by applying a rank test. When the Wilcoxon test is used, one gets the well-known form of the \(R\)-estimate of the shift between two samples:

$$\begin{aligned} \hat{{\Delta }}^{R}=\mathrm{med}(y_i -x_j ) \end{aligned}$$
(1)

where med denotes the median and \(1\le i\le n, 1\le j\le m\). Here, the general idea behind the Wilcoxon test, which leads to the formula in Eq. (1), is to shift the second sample by a certain value \(\Delta \) until the “displacement” between the samples becomes unrecognizable, in other words, until half of the non-zero differences \(\left( {y_i -\Delta } \right) -x_j \) are negative and the other half are positive (the differences equal to 0 are neglected). The detailed reasoning starts with the introduction of a test statistic \(h(X_{1}, X_{2}, X_{3},{\ldots }, X_{m}, Y_{1}, Y_{2}, Y_{3},{\ldots }, Y_{n})\) for the hypothesis \(H: \Delta =0\) against the alternative \(\Delta >0\). Let us now quote from Hodges and Lehmann (1963) the following assumptions:

  (i)

    \(h(x_{1}, x_{2},{\ldots }, x_{m}, y_{1}+a, y_{2}+a,{\ldots }, y_{n}+a)=h(x, y+a)\) is a nondecreasing function of \(a\) for all \(x\) and \(y\)

  (ii)

    when \(\Delta = 0\), the distribution of \(h(X_{1}, X_{2}, X_{3},{\ldots }, X_{m}, Y_{1}, Y_{2}, Y_{3},{\ldots }, Y_{n})= h(X,Y)\) is symmetric about a fixed point \(\mu \) (independent of the distributions of \(X_{j}\) and \(Y_{i}\))

(\(a\) and \(\mu \) are real numbers). Here and in what follows, we use the notation of the cited paper, where \(X = (X_{1}, X_{2}, X_{3},{\ldots }, X_{m})\) and \(Y = (Y_{1}, Y_{2}, Y_{3},{\ldots }, Y_{n})\), but also \(x= (x_{1}, x_{2},{\ldots }, x_{m})\) and hence \(y+a = (y_{1}+a, y_{2}+a,{\ldots }, y_{n}+a)\), etc.

Then Hodges and Lehmann (1963) proposed the following estimate of the shift

$$\begin{aligned} \hat{{\Delta }}=\frac{\Delta ^{{*}}+\Delta ^{{*}{*}}}{2} \end{aligned}$$
(2)

where

$$\begin{aligned} \Delta ^{{*}}&= \sup \left\{ {\Delta : h(x, y - \Delta )>\mu } \right\} , \mathrm{and} \nonumber \\ \Delta ^{{*}{*}}&= \inf \left\{ {\Delta : h(x, y - \Delta )<\mu } \right\} \end{aligned}$$
(3)

Thus, the different variants of HLE are based on specifically chosen test statistics \(h(x, y)\). For example, if \(h(x, y)\) is equal to the number of pairs (\(x_{j}, y_{i}\)) such that \(x_{j} < y_{i}\) (\(1\le i\le n,1\le j\le m\)) and \(X_{1}, X_{2}, X_{3},{\ldots }, X_{m}\) and \( Y_{1}, Y_{2}, Y_{3},{\ldots }, Y_{n}\) are assumed to be independently and identically distributed, then we obtain the HLE of the shift as in Eq. (1) (Hodges and Lehmann 1963). Such a statistic corresponds to the Mann–Whitney \(U\) test. Other variants of the statistic \(h\) are also presented in the paper cited.
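The classical estimate of Eq. (1) is straightforward to compute. The following minimal sketch (function name and data are illustrative assumptions, not taken from the paper) simply takes the median of all pairwise differences:

```python
import statistics

def hle_shift(x, y):
    """Classical Hodges-Lehmann estimate of the shift, Eq. (1):
    median of all pairwise differences y_i - x_j."""
    return statistics.median(yi - xj for yi in y for xj in x)

# illustrative samples: the second one is shifted by roughly +0.010
x = [0.000, 0.002, -0.001, 0.001]
y = [0.011, 0.009, 0.012, 0.008]
print(hle_shift(x, y))   # close to 0.010
```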

The \(R\)-estimate of Eq. (1) can be applied directly in deformation analyses only under some assumptions, for example, when the samples are sets of results of direct measurements of a certain quantity carried out at two different epochs. In more complex cases, for example in the case of a geodetic network created for deformation analyses, such an assumption is not acceptable. Then, we can use a different approach, namely one based on the application of initial residuals (Duchnowski 2009, 2010). The initial residuals are computed for initial values of the parameters, which are assumed or just taken from previous computations, by applying the functional model of the network in question. However, such an application is sufficient only for one-dimensional networks, for example leveling networks (Duchnowski 2011). A new approach that overcomes this limitation is proposed and described in the next section of the present paper.

Let us now consider another important problem with the application of HLE from Eq. (1) in geodetic or surveying analyses, namely the assumption about the distributions of the random variables \(X_{1}, X_{2}, X_{3},{\ldots }, X_{m}\) and \( Y_{1}, Y_{2}, Y_{3},{\ldots }, Y_{n}\). These assumptions, which are described in detail in Hodges and Lehmann (1963), lead to one important conclusion: all elements of the samples \(x_{1}, x_{2}, x_{3},{\ldots }, x_{m}\) and \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n}\) are assumed to have the same accuracy. In general, such an assumption is not acceptable in geodetic computations unless the differences in accuracy are small enough to be neglected. Thus, if we do want to consider different accuracies of the measurement results, we should propose another solution. One possible approach is the standardization of the initial residuals mentioned earlier. However, such an approach has at least one disadvantage: the result of the estimation is obtained as a dimensionless quantity and it might be difficult to transform it back and obtain, for example, point displacements in metric units. This might happen if two or more standardized differences applied in Eq. (1) are equal to each other but had different accuracies before standardization. Thus, the best way to deal with the problem in question is to propose a new variant of HLE that can take into account different accuracies of the measurement results.

Thus, let us first assume that the elements of the set \(x_{1}, x_{2}, x_{3},{\ldots }, x_{m}\) are realizations of random variables \(X_{j}\) whose distributions belong to the class \(\fancyscript{F}_{1}\) of continuous symmetric distributions such that \(E(X_{j})=E(X)\). A similar assumption concerns the set \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n }\) and the class \(\fancyscript{F}_{2}\), for which \(E(Y_{i})=E(Y)\). All the variables \(X_{j}\) and \(Y_{i}\) are assumed to be independent of each other, and their distributions can be defined as follows: \(P(X_j \le u)=F(u/\sigma _{X_j } )\) and \(P(Y_i \le u)=F(u/\sigma _{Y_i } -\Delta )\), where \(\sigma \) is the respective standard deviation. Here, we focus our attention on the application of the new estimates in deformation analyses, and hence \(x_{j}\) and \(y_{i}\) can be regarded as measurements (or their functions, see the Examples). It is well grounded in theory as well as in surveying practice that measurement results are normally distributed, and hence the classes \(\fancyscript{F}_{1}\) and \(\fancyscript{F}_{2}\) are classes of normal distributions; thus the presented assumptions are fulfilled in this case. Note that the symmetric distributions within the classes \(\fancyscript{F}_{1}\) and \(\fancyscript{F}_{2}\) also include symmetric heavy-tailed distributions, which is important when modeling the occurrence of outliers. Summing up, we assume that each of the sets \(x_{1}, x_{2}, x_{3},{\ldots }, x_{m}\) and \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n }\) contains realizations of different random variables that are independent and symmetrically distributed with the common respective expectations but may differ in variances.

Let us now introduce a new variant of the test statistic \(h\) that takes into account possible differences in accuracy among the \(x_{j}\) and among the \(y_{i}\). Assuming the respective variances \(V(X_j )=\sigma _{X_j }^2 \) and \(V(Y_i )=\sigma _{Y_i }^2 \), let the accuracy of all \(x_{j}\) and \(y_{i}\) be described by the weights \(p_{x_j } =\sigma _0^2 /\sigma _{X_j }^2 \) and \(p_{y_i } =\sigma _0^2 /\sigma _{Y_i }^2 \) (usually it is assumed that \(\sigma _0^2 =1\)), which is a very useful and often applied way of describing the accuracy of measurement results in surveying problems. Now, let all the differences \(W_l =y_i -x_j \), \(1\le l\le nm\), be computed and ordered; their weights are then \(p_l =p_{W_l } =\left( {p_{x_j }^{-1} +p_{y_i }^{-1} } \right) ^{-1}\). Let us now propose the following statistic

$$\begin{aligned} h(x,y)=\sum _{l=t-k+1}^t {p_l } \end{aligned}$$
(4)

to test the hypothesis \(H: \Delta =0\) against the alternative \(\Delta >0\), where \(k\) is the number of positive differences \(W_l\), \(0\le k\le t\), and \(t\) is the total number of differences. Such a function obviously fulfills condition (i). If \(\fancyscript{F}_{1}\) and \(\fancyscript{F}_{2 }\) are classes of symmetric distributions, then condition (ii) is also fulfilled if (see Appendix)

$$\begin{aligned} \mu =\frac{1}{2}\sum _{l=1}^t {p_l } \end{aligned}$$
(5)

One can also consider another form of the test statistic \(h(x, y)\) that does not neglect zero differences; then let

$$\begin{aligned} h_1 (x,y)=\sum _{l=t-k+1}^t {p_l } +\frac{1}{2}\sum _{l=t-k-d+1}^{t-k} {p_l } \end{aligned}$$
(6)

where \(d\) is the number of zero differences. This statistic also fulfills condition (i), and condition (ii) is satisfied for \(\mu =\frac{1}{2}\sum \nolimits _{l=1}^{mn} {p_l } \) (similarly to the function from Eq. 4).
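Because the ordering of the differences only serves to pick out the positive (and, for \(h_1\), the zero) differences, both statistics can be computed directly from the two samples and their weights. Below is a minimal sketch of such a computation (the function name and its arguments are assumptions of this sketch, not the paper's notation):

```python
def weighted_stat(x, y, px, py, include_zeros=False):
    """Test statistic h(x, y) of Eq. (4): the sum of the weights of the
    positive differences W_l = y_i - x_j; with include_zeros=True the
    variant h_1 of Eq. (6) is returned (zero differences contribute
    half of their weights)."""
    pairs = []
    for yi, pyi in zip(y, py):
        for xj, pxj in zip(x, px):
            w = yi - xj
            p = 1.0 / (1.0 / pxj + 1.0 / pyi)   # p_l = (p_xj^-1 + p_yi^-1)^-1
            pairs.append((w, p))
    h = sum(p for w, p in pairs if w > 0)
    if include_zeros:
        h += 0.5 * sum(p for w, p in pairs if w == 0)
    return h
```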

The explicit form of HLE based on the test statistic of Eq. (4) or (6) can be found through Eqs. (2) and (3). Thus, let us consider two possible cases. Suppose first that there exists a positive natural number \(b\) such that \(\mu =\sum \nolimits _{l=1}^b {p_l } \); thus \(\mu \) is a possible value of \(h(x, y)\). Then, in accordance with Eq. (3), we should find the largest \(\Delta \) for which the sum of the weights of the positive differences is greater than \(\mu \), thus

$$\begin{aligned} \Delta ^{{*}}&= \sup \left\{ {\Delta : h(x, y - \Delta )>\sum _{l=1}^b {p_l } } \right\} \nonumber \\&= \sup \left\{ {\Delta : \Delta \le W_b } \right\} =W_b \end{aligned}$$
(7)

(remembering that \(h(x, y - \Delta )=h(x_1 ,\ldots ,x_m , y_1 - \Delta ,\ldots ,y_n - \Delta ))\). We should also find the smallest \(\Delta \) for which the sum of the weights of the positive differences is smaller than \(\mu \), hence

$$\begin{aligned} \Delta ^{{*}{*}}&= \inf \left\{ {\Delta : h(x, y - \Delta )<\sum _{l=1}^b {p_l } } \right\} \nonumber \\&= \inf \left\{ {\Delta :\Delta \ge W_{b+1} } \right\} =W_{b+1} \end{aligned}$$
(8)

Thus, in this case \(\hat{{\Delta }}=\frac{\Delta ^{{*}}+\Delta ^{{*}{*}}}{2} =\frac{W_b +W_{b+1} }{2}\).

In the second possible case, let \(\mu =\frac{1}{2}\sum \nolimits _{l=1}^t {p_l } \) not be a possible value of \(h(x, y)\). This time, let \(b\) be the positive natural number for which

$$\begin{aligned} s_1 =\sum _{l=1}^b {p_l } >\frac{1}{2}\sum _{l=1}^t {p_l }\quad \mathrm{and} \quad s_2 =\sum _{l=b}^t {p_l } >\frac{1}{2}\sum _{l=1}^t {p_l } \end{aligned}$$
(9)

Under such assumptions one can write

$$\begin{aligned} \Delta ^{{*}}&= \sup \left\{ {\Delta : h(x, y - \Delta )>\frac{1}{2}\sum _{l=1}^t {p_l } } \right\} \nonumber \\&= \left\{ {{\begin{array}{l} {\sup \left\{ {\Delta : \Delta \le W_b } \right\} =W_b \quad \hbox {if}\quad s_2 >s_1 } \\ {\sup \left\{ {\Delta : \Delta <W_b } \right\} =W_b \quad \hbox {if}\quad s_2 <s_1 } \\ \end{array} }} \right. \end{aligned}$$
(10)
$$\begin{aligned} \Delta ^{{*}{*}}&= \inf \left\{ {\Delta : h(x, y - \Delta )<\frac{1}{2}\sum _{l=1}^t {p_l } } \right\} \nonumber \\&= \left\{ {{\begin{array}{l} {\inf \left\{ {\Delta :\Delta >W_b } \right\} =W_b \quad \hbox {if}\quad s_2 >s_1 } \\ {\inf \left\{ {\Delta :\Delta \ge W_b } \right\} =W_b \quad \hbox {if}\quad s_2 <s_1 } \\ \end{array} }} \right. \end{aligned}$$
(11)

Hence, \(\hat{{\Delta }}=\frac{\Delta ^{{*}}+\Delta ^{{*}{*}}}{2} =\frac{W_b +W_b }{2}=W_b \). Note that exactly the same estimates could be derived if the test statistic \(h_{1}(x, y)\) from Eq. (6) is applied.

Both estimates obtained can be written in the following common form

$$\begin{aligned} \hat{{\Delta }}^{W}=\mathrm{medw}(W_l )=\mathrm{medw}(y_i -x_j ), \end{aligned}$$
(12)

where medw denotes the weighted median.

Generally, the weighted median of a set of \(n\) real numbers \(z_{i}\) with positive real weights \(p_{z_i } \) (note that in our case \(z_{i}=W_{l}\) and \(p_{z_i } =p_l \)) is the solution of the following problem (Gurwitz 1990)

$$\begin{aligned} \mathop {\min }\limits _{z\in R} f(z)=\mathop {\min }\limits _{z\in R} \sum _{i=1}^n {p_{z_i } \left| {z-z_i } \right| } \end{aligned}$$
(13)

which is equivalent to the following one

$$\begin{aligned} \sum _{i:z_i <z} {p_{z_i } }&\le \left( {\sum _{i:z_i =z} {p_{z_i } } +\sum _{i:z_i >z} {p_{z_i } } } \right) \quad \mathrm{and} \nonumber \\&\quad \left( {\sum _{i:z_i <z} {p_{z_i } } +\sum _{i:z_i =z} {p_{z_i } } } \right) \ge \sum _{i:z_i >z} {p_{z_i } } \end{aligned}$$
(14)

The alternative definitions of the weighted median, its properties, as well as algorithms for its computation can be found, e.g., in Gurwitz (1990), Yager (1998) and Beliakov et al. (2011). The choice of a particular computation algorithm is not so important if the set \(z_{1}, z_{2}, z_{3},{\ldots }, z_{n }\) (or in our case the sets \(x_{1}, x_{2}, x_{3},{\ldots }, x_{m}\) and \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n}\)) has only a few elements, as is usually the case in surveying computations. In such a case, a partial heapsort method can be used; it is rather easy to carry out, although it is not the fastest way to compute the weighted median. Here, the partial heapsort method can be carried out in the following way. First, order the sample \(z_{1}, z_{2}, z_{3},{\ldots }, z_{n}\), then start to sum the weights of the \(k\) succeeding smallest \(z_{i}\). The weighted median is equal to the \(z_{k}\) for which the sum in question becomes greater than half of the sum of all the weights \(p_{z_i } \). If the sum is exactly equal to half of the sum of the weights, then one should take the mean of \(z_{k}\) and \(z_{k+1}\) (see, Gurwitz 1990). Thus, generally, the idea behind the estimate in Eq. (12) is as follows: let the values \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n}\) be shifted to the left by a certain value \(\Delta \) until the differences \(W_l \) and their weights \(p_l \) fulfill the requirement in Eq. (14). Note that if all the weights are equal to each other, then the estimate in question is equal to the HLE from Eq. (1).
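The following is a minimal sketch of the weighted median computed by the simple ordering-and-summing scheme described above, and of the estimate of Eq. (12) built on it; the helper names are assumptions of this sketch, not part of the paper:

```python
def weighted_median(z, p):
    """Weighted median of the values z with positive weights p, computed by
    ordering the sample and accumulating the weights of the smallest values."""
    pairs = sorted(zip(z, p))                    # order the sample
    half = 0.5 * sum(p)
    acc = 0.0
    for k, (zk, pk) in enumerate(pairs):
        acc += pk                                # weights of the k smallest z_i
        if acc > half:
            return zk
        if acc == half:                          # exactly half: mean of z_k, z_(k+1)
            return 0.5 * (zk + pairs[k + 1][0])
    return pairs[-1][0]

def hlwe_shift(x, y, px, py):
    """HLWE of the shift between the samples x and y, Eq. (12): the weighted
    median of the differences W_l = y_i - x_j with weights p_l."""
    w, pw = [], []
    for yi, pyi in zip(y, py):
        for xj, pxj in zip(x, px):
            w.append(yi - xj)
            pw.append(1.0 / (1.0 / pxj + 1.0 / pyi))
    return weighted_median(w, pw)
```

If all the weights passed to hlwe_shift are equal, the result coincides with the classical HLE of Eq. (1), in line with the remark above.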

The new variant of the shift estimate that is introduced in Eq. (12) is based on the test statistic of Eq. (4) which depends on the weights of the ordered differences \(W_l \). Hence, it will be called the Hodges–Lehmann weighted estimate (HLWE).

Let us now apply a similar approach to the one-sample problem. Thus, let us consider a set \(x_{1}, x_{2}, x_{3},{\ldots }, x_{m}\) under the same assumptions that were made for the first sample in the two-sample case. Then, let the common expected value \(E(X_{j})=E(X)\) be estimated. One can propose an analogous new test statistic \(h(x)\), which takes into account different accuracies of the measurement results, together with specially modified requirements (i) and (ii) (see, Hodges and Lehmann 1963). However, there is another way to derive an explicit form of HLWE of the expected value. Let the set \(-x_{1}, -x_{2}, -x_{3},{\ldots }, -x_{m}\) be created and let it stand in for the second set in the two-sample problem. This set is shifted to the left by \(2E(X)\), and of course the shift can be estimated by using HLWE of Eq. (12). Hence,

$$\begin{aligned} \hat{{\Delta }}^{W}=2\hat{{E}}^{W}(X)=\mathrm{medw}(W_l )=\mathrm{medw}(x_i -(-x_j )) \end{aligned}$$
(15)

and the final form of HLWE of the expected value is as follows

$$\begin{aligned} \hat{{E}}^{W}(X)=\frac{\mathrm{medw}(x_i +x_j )}{2}=\mathrm{medw}\left( {\frac{x_i +x_j }{2}} \right) , \end{aligned}$$
(16)

where \(1\le i\le n,1\le j\le n\). This estimate is consistent with the original HLE of the expected value as well as with the respective \(R\)-estimate presented in Huber (1981), provided that all the weights are the same. Note that the estimate in question is robust against outliers only if \(n\ge 5\), which limits its application in some surveying problems (Rousseeuw and Verboven 2002; Duchnowski 2009).
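A minimal sketch of Eq. (16) is given below; it assumes that the pairwise means inherit the weights of the corresponding sums, \(p_l =\left( {p_{x_i }^{-1} +p_{x_j }^{-1} } \right) ^{-1}\), which follows from the derivation above, and it reuses the weighted_median helper sketched earlier:

```python
def hlwe_expectation(x, px):
    """HLWE of the expected value, Eq. (16): the weighted median of the
    pairwise means (x_i + x_j) / 2, 1 <= i, j <= n."""
    z, pz = [], []
    for xi, pi in zip(x, px):
        for xj, pj in zip(x, px):
            z.append(0.5 * (xi + xj))
            pz.append(1.0 / (1.0 / pi + 1.0 / pj))   # assumed weight of the mean
    return weighted_median(z, pz)
```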

3 HLWE in deformation analyses

The HLWE is a natural estimate of the shift between two sets of observations in the case of random variables defined as at the beginning of the previous section. Thus, if the sets are properly constructed, HLWE can also be applied in deformation analyses to estimate point displacements.

Consider a geodetic network established to analyze deformation of a certain structure. Each network point has one, two or three coordinates, depending on the dimension of the network. Suppose that these coordinates can be computed by applying the measurement results and the coordinates of the reference points. Let these computations be carried out by using only the simplest surveying methods, for example, by adding measured height differences to the heights of the reference points, or by applying angular or linear intersections, resections, 3D resections, or just the polar method. Generally, each coordinate can be computed at least several times, thus one can obtain a set of such values. If the observations are independent, then we can assume that the set contains realizations of independent random variables with a common expected value, the “true” coordinate, and with different variances, which depend on the accuracy of the observations and the computation method applied. Suppose that the network was measured at least twice, at two different epochs. Then one can create the sets in question for each coordinate and for each measurement epoch, and the change of each coordinate can be estimated by applying the HLWE proposed earlier (provided the coordinates can be regarded as normally distributed).
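As a brief illustration of this construction (the coordinate values and accuracies below are invented, and the hlwe_shift sketch from Sect. 2 is assumed), the sets for one coordinate at two epochs could be processed as follows:

```python
# values of one coordinate computed in several independent ways (illustrative only)
x_epoch1 = [102.3041, 102.3050, 102.3046]           # m, epoch 1
y_epoch2 = [102.3089, 102.3096, 102.3091]           # m, epoch 2
p1 = [1 / 0.003**2, 1 / 0.004**2, 1 / 0.002**2]     # weights p = 1 / sigma^2, epoch 1
p2 = [1 / 0.003**2, 1 / 0.004**2, 1 / 0.002**2]     # weights, epoch 2

displacement = hlwe_shift(x_epoch1, y_epoch2, p1, p2)   # estimated coordinate change in m
```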

Now let us present two examples that show the application and properties of the new estimate in deformation analyses.

3.1 Example 1

Consider simple leveling networks with \(n\ge 3\) reference points and one objective point whose vertical displacement is tested. Let all height differences between the objective point and the reference points be measured at two different epochs. Thus, \(n\) values of the height of the objective point can be computed for each of the measurement epochs. Let them create the sets \(x_{1}, x_{2}, x_{3},{\ldots }, x_{n}\) and \(y_{1}, y_{2}, y_{3},{\ldots }, y_{n}\), respectively. Let the variances of these heights be described by the formula \(\sigma _{x_i }^2 =\sigma _{y_i }^2 =\sigma _0^2 D_i \), where \(D_{i}\) is the length of the \(i\)th leveling line in km (\(\sigma _0^2 =1\)). Assume that the measurement errors are Gaussian with expected value equal to zero and the variances as given above. Let such errors be simulated 100,000 times in each of the computation variants, and let the theoretical displacement \(\Delta \), which is equal to zero, be estimated by applying HLWE of Eq. (12), LSE, i.e., the weighted mean, and the traditional HLE (in this case we neglect the differences in the variances and assume that all measurement results have the same accuracy). The random errors of the simulated measurement results were generated by Mathcad 15.0. The comparison of the results obtained is presented in Table 1.

Table 1 Comparison of HLWE with LSE and HLE

The main conclusion is that the three estimates give very similar results. The results obtained for HLWE and HLE are often the same; however, this strongly depends on how much the weights differ from one another. When the results of the two estimates differ, the better results are mostly obtained for HLWE. Comparing HLWE with LSE, one can say that LSE gives slightly better results; note, however, that HLWE is better in about 43 % of the simulations. Similar conclusions follow from the analyses of the root-mean-squared deviations \(\hbox {RMSD}=\sqrt{\sum \nolimits _{i=1}^n {(\Delta -\hat{{\Delta }}_i )^{2}} /n}\) (where \(n\) is the number of experiments), which were computed for each of the estimates in each of the variants on the basis of the 100,000 simulations.

Here, always \(\hbox {RMSD}_\mathrm{HLE} \ge \hbox {RMSD}_\mathrm{HLWE} >\hbox {RMSD}_\mathrm{LSE} \) (considering also that HLWE are equal to HLE if all the weights are the same). The differences in RMSDs are usually small and \(<\)0.1 mm.
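The comparison of Example 1 can be reproduced in outline with a short Monte Carlo sketch; the leveling-line lengths below are assumed for illustration (the paper's exact variants are not reproduced), and the hle_shift and hlwe_shift sketches from Sect. 2 are reused:

```python
import math
import random

D = [0.8, 1.5, 2.4, 1.1]              # assumed leveling-line lengths in km
p = [1.0 / d for d in D]              # weights, since sigma_i^2 = sigma_0^2 * D_i, sigma_0^2 = 1
true_shift = 0.0                      # the objective point is stable
n_sim = 10000
errors = {"HLWE": [], "LSE": [], "HLE": []}

for _ in range(n_sim):
    # simulated heights of the objective point (measurement errors only)
    x = [random.gauss(0.0, math.sqrt(d)) for d in D]   # epoch 1
    y = [random.gauss(0.0, math.sqrt(d)) for d in D]   # epoch 2
    lse = (sum(pi * yi for pi, yi in zip(p, y))
           - sum(pi * xi for pi, xi in zip(p, x))) / sum(p)   # weighted means
    errors["HLWE"].append(hlwe_shift(x, y, p, p) - true_shift)
    errors["LSE"].append(lse - true_shift)
    errors["HLE"].append(hle_shift(x, y) - true_shift)

for name, e in errors.items():
    rmsd = math.sqrt(sum(v * v for v in e) / len(e))
    print(name, round(rmsd, 4))
```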

3.2 Example 2

Let us now consider a horizontal control network with three reference points and two objective points. The objective points are not connected to each other; thus each objective point together with the reference points constitutes an independent subnetwork, as proposed in Hekimoglu et al. (2010). The network as well as the observed horizontal angles and distances are shown in Fig. 1. Let us now simulate the measurement results for two measurement epochs under the following assumptions: the standard deviations are \(\sigma _A =5^\mathrm{cc}\) for the angles and \(\sigma _D =2\) mm for the distances; the measurement errors are Gaussian with expected value equal to 0; both objective points are displaced and the shifts of their coordinates are \(\Delta _{X1} =40\,\hbox {mm}\), \(\Delta _{Y1} =20\,\hbox {mm}\) for point 1, and \(\Delta _{X2} =-30\,\hbox {mm}\), \(\Delta _{Y2} =-10\,\hbox {mm}\) for point 2. The simulated measurement results are presented in Table 2.

Fig. 1 Horizontal control network

Table 2 Simulated measurements

Let now the displacements of the objective points be estimated by applying HLWE of the shift, Eq. (12). First, let the coordinates of the objective points be computed by applying the measurements and simple surveying methods. For example, the coordinate X1 can be calculated in seven different ways by using the coordinates of the reference points and different pairs of the observations. Note that since the angles and distances are normally distributed, within a close neighborhood of \(E(\hbox {X1})\) also X1 can be regarded as normally distributed, which fulfills the theoretical assumptions for HLWE. If we also consider the assumed standard deviations of the distances and the angles, then it is easy to calculate the standard deviations of the values of the coordinate X1 computed in this way, according to the law of propagation of variance and treating the reference points as fixed. The results of these calculations are listed in Table 3.

Table 3 Values of the coordinate X1 at the first epoch
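A generic sketch of one such simple computation, the polar method, together with the law of propagation of variance for the resulting X coordinate, is given below; the numbers are illustrative only and do not reproduce the Fig. 1 geometry:

```python
import math

def polar_x(x_ref, azimuth, distance, sigma_az, sigma_d):
    """X coordinate from a fixed reference point (polar method) and its
    standard deviation from the law of propagation of variance."""
    x = x_ref + distance * math.cos(azimuth)
    dx_dd = math.cos(azimuth)               # partial derivative w.r.t. the distance
    dx_daz = -distance * math.sin(azimuth)  # partial derivative w.r.t. the azimuth
    sigma_x = math.sqrt((dx_dd * sigma_d) ** 2 + (dx_daz * sigma_az) ** 2)
    return x, sigma_x

sigma_az = 5.0 * (math.pi / 200.0) / 10000.0    # 5 cc expressed in radians
x1, s_x1 = polar_x(x_ref=1000.000, azimuth=math.radians(42.0),
                   distance=350.000, sigma_az=sigma_az, sigma_d=0.002)
weight_x1 = 1.0 / s_x1 ** 2                     # weight of this value of X1
```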

Note that each observation can be used only once, to avoid dependences between the computation results. It is also worth noting that there are other possible ways to calculate X1, namely by applying other intersections and other combinations of the observations; however, there are always seven independent ways to calculate X1. Of course, similar calculations can be carried out for X1 at the second measurement epoch as well as for the other coordinates of the objective points. The respective sets of the coordinate values can then be applied in Eq. (12), resulting in HLWE of the coordinate shifts. Note that the coordinate shifts can also be estimated by applying HLWE of the expected value from Eq. (16). Then, the point coordinates are estimated separately for both epochs and the differences between the respective values are the estimates of the coordinate shifts. Of course, the same shifts can be estimated by LSE. Also in this case, the network is adjusted separately for each epoch and the shift estimates are equal to the differences of the respective objective point coordinates. The estimation results and their comparison are presented in Table 4. Consider now a second variant in which some observations are disturbed by non-random errors. Let four distances, namely C-1, 1-C, 2-C and C-2, be affected by a non-random error of 0.01 m, and four angles, A-C-1, B-C-1, A-C-2 and B-C-2, by an error of \(50^\mathrm{cc}\). Thus, the observations that concern point C are outliers. The estimation results under such assumptions are also presented in Table 4.

Table 4 Results of estimation

Let us now compare the norms of the vectors e, which contain the respective values \(\hbox {e}_{i}\), for the three kinds of estimates of the coordinate shifts. In the first variant the estimation results are very similar, and so are the norm values. In the second variant the differences between the estimation results, as well as between the values of the norms, are more evident. One should expect such results considering the lack of robustness of LSE and the well-known robustness of the Hodges–Lehmann estimates. There is no doubt that \(\hat{{\Delta }}^{W}\) gave much better results than \(\hat{{E}}^{W}(X)\); hence it is the more robust variant of HLWE in this case. Of course, this advantage of HLWE disappears if the outliers are cleared off the measurement results. The traditional methods of data cleaning are often based on standardized residuals of the observations from the least squares method. In the network at hand, all the standard deviations of the distance residuals are very close to each other, and so are all the standard deviations of the angle residuals. Thus, we can compare the residuals before standardization when trying to find outliers. As one can see, not only the real outliers, whose residuals are underlined, have large residuals (the residuals are listed in Table 5). This means that during data cleaning one would reject some “good” results. The same conclusion concerns the robust methods of \(M\)-estimation, where the weights of some good observations would be reduced unnecessarily. Suppose now that outliers are to be detected by applying HLWE, namely by using the observation residuals related to these estimates. Let us apply the estimates \(\hat{{\Delta }}^{W}\), which are the most robust here. The objective point coordinates at the second epoch can then be calculated by adding the estimated coordinate shifts (\(\hat{{\Delta }}^{W}\)) to the respective values of the coordinates at the first epoch (computed, for example, by \(\hat{{E}}^{W}\left( X \right) \) or by LSE). The observation residuals related to such computed coordinates are shown in Table 5. This time it is much easier to detect the outlying observations.

Table 5 Observation residuals

Let us now investigate the distributions of HLWE of the coordinate shifts. Thus, let the measurement results be simulated 100,000 times under the same assumptions as in the numerical example presented above (the variant without gross errors) and let HLWE of all four coordinate shifts be computed. For the sake of comparison, let the same shifts be estimated by applying the LS method (weighted means). The results of such Monte Carlo simulations are presented in Table 6 and the histograms of HLWE are shown in Fig. 2.

Fig. 2 Histograms of Monte Carlo simulations of the values: a \(\hat{{\Delta }}_{X1}^W \), b \(\hat{{\Delta }}_{Y1}^W \), c \(\hat{{\Delta }}_{X2}^W \), d \(\hat{{\Delta }}_{Y2}^W \) (\(\omega \) is the frequency)

Table 6 Results of Monte Carlo simulations of shift estimation

The histograms that are presented in Fig. 2 are symmetric about the means, which are actually equal to the true shifts. Thus, HLWEs are not biased here.

The results presented in Table 6 are very similar for both HLWE and LSE. The only difference concerns their standard deviations. However, the respective standard deviations are equal to each other if they are rounded to one-tenth of a millimeter. The last column of Table 6 presents the standard deviations of LSE computed by applying the law of propagation of variance, which are also very close to the values of the standard deviations of HLWE resulting from the Monte Carlo simulations. Here, we apply HLWE to deformation analyses, thus it is very important to know the accuracy of this estimate. Since the law of propagation of variance cannot be used to compute the variance of HLWE directly from the variances of the observations applied, the application of Monte Carlo simulations seems to be the best way to assess the accuracy of HLWE.
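A minimal sketch of such a Monte Carlo assessment is shown below; the standard deviations are assumed for illustration and the hlwe_shift sketch from Sect. 2 is reused:

```python
import random
import statistics

def hlwe_sigma_mc(sigmas_x, sigmas_y, n_sim=10000):
    """Monte Carlo standard deviation of the HLWE of the shift for zero-mean
    normally distributed values with the given standard deviations."""
    px = [1.0 / s ** 2 for s in sigmas_x]
    py = [1.0 / s ** 2 for s in sigmas_y]
    estimates = []
    for _ in range(n_sim):
        x = [random.gauss(0.0, s) for s in sigmas_x]
        y = [random.gauss(0.0, s) for s in sigmas_y]
        estimates.append(hlwe_shift(x, y, px, py))
    return statistics.pstdev(estimates)

# e.g. accuracy of the shift estimate for assumed accuracies of the coordinate values
print(hlwe_sigma_mc([0.003, 0.004, 0.002], [0.003, 0.004, 0.002]))
```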

From the practical point of view, it would be helpful to find a simplified formula for assessing the accuracy of HLWE. The results obtained in the present example, as well as many other simulations not presented here, show that

$$\begin{aligned} \hat{{\sigma }}_\mathrm{_{HLWE}} =c\cdot \hat{{\sigma }}_\mathrm{_{LSE}}, \end{aligned}$$
(17)

where \(c\) is a proportionality factor and \(c\approx 1.07\). The differences between the standard deviations of HLWE obtained from Monte Carlo simulations and those from Eq. (17) are usually \(\le \)1 % and very rarely \(\ge \)2 % (the maximum difference was \(\le \)5 %). Note that the formula of Eq. (17) is valid and was tested for normal distributions, which is however acceptable when applying HLWE in a deformation analysis.

4 Conclusions

The estimates proposed in this paper, namely the Hodges–Lehmann weighted estimates, are based on theoretical foundations similar to those of the traditional \(R\)-estimates. Thus, we can suppose that the general properties of HLWE are similar to those of the conventional HLE. The main difference is that the new estimators take into account the differences in measurement accuracy among the observations. This extends the possible application of the estimates to more complex surveying tasks. From the practical point of view, their high robustness against outliers seems to be another important property. This can not only protect the estimation results from the influence of gross errors, but can also be applied to solve some surveying problems. HLWE could be used to test the stability of the reference points in 2D or 3D networks, for example in the case of horizontal control networks (a similar method was proposed in Duchnowski (2010), but it concerned leveling networks). However, such an application surely requires investigating the robustness of HLWE in greater detail.

The paper presents two variants of HLWE. Theoretical properties as well as the numerical tests (see Example 2) show that the HLWE of the shift is the better choice for geodetic or surveying tasks. This follows from its robustness also in the case of very small samples. Note that the values of this variant of HLWE are less affected by gross errors (Example 2).

The paper also presents a new approach to applying HLWE (but also other variants of HLE) in geodetic or surveying problems. It is based on computing the coordinates of, for example, objective points in all possible independent ways and applying the sets created in this manner in HLWE. This approach seems to be more natural, easier to carry out and more convenient in deformation analysis than the previously applied method based on initial residuals.

The last, but not least, conclusion concerns the comparison of HLWE with traditional LSE when the observation errors are assumed to be normally distributed. Example 1 shows that if the observation set is free of outliers, then the RMSDs of HLWE of the shift are usually only a little bigger than those of LSE. The same conclusion follows from Example 2 and concerns the standard deviations of both estimates. Thus, generally, the accuracies of these estimates are very similar. This knowledge is applied in the simple formula of Eq. (17), by which the standard deviation of HLWE can be computed in a very practical way.

5 Appendix

Let us consider the distribution of the statistic \(h(X,Y)\) from Eq. (4). First, let us investigate the symmetry of the distributions of \(W_l (X,Y)=Y_i -X_j \) under the assumption that \(\Delta = 0\). Thus, if \(\fancyscript{F}_{1}\) and \(\fancyscript{F}_{2 }\) are classes of symmetric distributions, then \(\mu _3 (Y_i )=0\) and also \(\mu _3 (X_j )=0\), where \(\mu _3 \) is the third central moment of the respective variable. Let us now consider \(\mu _3 (W_l )\). Since \(\Delta = 0\), then \(E(Y)=E(X)\), hence \(E(Y_i )=E(X_j )\). Applying this equality and considering that the variables \(Y_i \) and \(X_j \) are independent, we obtain

$$\begin{aligned} \mu _3 (W_l )&= \mu _3 (Y_i -X_j )=E\left( {Y_i -X_j -\underbrace{E(Y_i -X_j )}_0} \right) ^{3} \\&= E\left( {Y_i^3 -3Y_i^2 X_j+3Y_i X_j^2 -X_j^3 } \right) \\&= E\left( Y_i^3 \right) -3E\left( Y_i^2 \right) E(X_j )+3E(Y_i )E\left( X_j^2 \right) \\&-E(X_j^3 )+\underbrace{2E^{3}(Y_i )-2E^{3}(X_j )}_0 \\&= \underbrace{E\left( Y_i^3 \right) -3E(Y_i^2 )E(Y_i )+2E^{3}(Y_i )}_{\mu _3 (Y_i )} \\&-\left( {\underbrace{E\left( X_j^3 \right) -3E(X_j )E\left( X_j^2 \right) +2E^{3}(X_j )}_{\mu _3 (X_j )}} \right) \\&= \mu _3 (Y_i )-\mu _3 (X_j )=0 \end{aligned}$$

Thus, if the distributions of \(Y_i \) and \(X_j \) are symmetric, then the distributions of \(W_l (X,Y)=Y_i -X_j \) are also symmetric (for all \(1\le i\le n\), \(1\le j\le m\), \(1\le l\le nm\)), and additionally, since \(E(Y_i )=E(X_j )\), also \(E(W_l (X,Y)) = 0\). Note that the symmetry of the distributions of \(Y_i \) and \(X_j \) is a sufficient condition here. The necessary and sufficient condition is that \(\mu _3 (Y_i )=\mu _3 (X_j )\); hence \(\fancyscript{F}_{1}\) and \(\fancyscript{F}_{2 }\) may also be classes of non-symmetric distributions; however, such a case will not be discussed here.

Considering Eq. (4), \(h(X,Y)\) is a discrete random variable and its possible values can be computed as \(h_r =\sum \nolimits _{l=1}^{nm} {k_l^r \cdot p_l } \) for \(1\le r\le 2^{nm}\) where: \(p_l =p_{W_l } \) is the weight of the \(l\)th difference \(W_l (X,Y)\), and \(k_l^r =1\) if \(W_l (X,Y)>0, k_l^r =0\) if \(W_l (X,Y)\le 0\). Let us now consider pairs of the values \(h_c \) and \(h_d \) such that \(k_l^c \ne k_l^d \) for all \(l\) (all the possible values create \(2^{nm-1}\) such pairs). One of the pairs in question is the pair of the minimal and maximal values of \(h(X,Y)\), namely \(h_c =0\) if all \(k_l^c =0\) and \(h_d =\sum \nolimits _{l=1}^{nm} {p_l } \) if all \(k_l^d =1\). Thus, all the other pairs \(h_c \) and \(h_d \) must lie within the interval \(\left\langle {0,\sum \nolimits _{l=1}^{nm} {p_l } } \right\rangle \) and what is more, each pair is symmetrically placed with respect to the interval ends. Additionally, since all \(W_l (X,Y)\) are symmetrically distributed about 0 then the probabilities of both values \(h_c \) and \(h_d \) are equal to each other \(P\left( {h(X,Y)=h_c } \right) =P\left( {h(X,Y)=h_d } \right) \). (Note, that if all \(W_l (X,Y)\) were independent of each other, then this probability would be equal to \(0.5^{nm}\)). Hence, \(h(X,Y)\) is symmetrically distributed about the center of the interval, namely \(\mu =\frac{1}{2}\sum _{l=1}^{nm} {p_l } \).

To illustrate the theoretical considerations presented above, let us now present the results of Monte Carlo simulations of \(h(X,Y)\). Thus, let \(\Delta = 0\) and let the variables \(X_{j }\) and \(Y_{i}\) be normally distributed with the standard deviations \(\sigma _{X_j } \) and \(\sigma _{Y_i } \), respectively. Let us consider four example variants:

  (a)

    \( m=n=3, \sigma _{X_1 } =4, \sigma _{X_2 } =3, \sigma _{X_3 } =2, \sigma _{Y_1 } =3,\) \(\quad \sigma _{Y_2 } =1, \sigma _{Y_3 } =2\)

  (b)

    \(m=n=4, \sigma _{X_1 } =4, \sigma _{X_2 } =6, \sigma _{X_3 } =1, \sigma _{X_4 } =1, \) \(\quad \sigma _{Y_1 } =1, \sigma _{Y_2 } =3, \sigma _{Y_3 } =2, \sigma _{Y_4 } =5\)

  (c)

    \( m=n=5, \sigma _{X_1 } =\sigma _{X_2 } =\sigma _{X_3 } =\sigma _{X_4 } =1, \sigma _{X_5 } =2,\) \(\quad \sigma _{Y_1 } =1, \sigma _{Y_2 } =\sigma _{Y_3 } =\sigma _{Y_4 } =\sigma _{Y_5 } =2\)

  (d)

    \(m=n=7, \sigma _{X_1 } =2, \sigma _{X_2 } =1, \sigma _{X_3 } =4, \sigma _{X_4 } =3, \) \(\sigma _{X_5 } =1, \sigma _{X_6 } =5, \sigma _{X_7 } =1,\) \( \quad \sigma _{Y_1 } =3, \sigma _{Y_2 } =2, \sigma _{Y_3 } =1, \sigma _{Y_4 } =2, \sigma _{Y_5 } =3, \) \(\quad \sigma _{Y_6 } =1, \sigma _{Y_7 } =6\)

The histograms of the values of \(h(X,Y)\) for these variants are presented in Fig. 3. Each of them is based on 100,000 simulations and is quite symmetric, which of course results from the symmetric distribution of \(h(X,Y)\). Figure 3 presents only four example histograms; however, the other simulations carried out by the author show that the histograms of the values of \(h(X,Y)\) are always symmetric, hence the symmetry is independent of the values of \(m, n, \sigma _{X_j } , \sigma _{Y_i}\).
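A sketch of such a Monte Carlo check for variant (a) is given below; it reuses the weighted_stat sketch from Sect. 2 and verifies that the sample mean of \(h(X,Y)\) is close to \(\mu =\frac{1}{2}\sum p_l \) when \(\Delta = 0\) (the histogram itself is not drawn here):

```python
import random

sx = [4.0, 3.0, 2.0]                  # sigma_X1, sigma_X2, sigma_X3 (variant a)
sy = [3.0, 1.0, 2.0]                  # sigma_Y1, sigma_Y2, sigma_Y3 (variant a)
px = [1.0 / s ** 2 for s in sx]
py = [1.0 / s ** 2 for s in sy]

# mu = half of the sum of all the weights p_l = (p_xj^-1 + p_yi^-1)^-1
mu = 0.5 * sum(1.0 / (1.0 / pxj + 1.0 / pyi) for pyi in py for pxj in px)

values = []
for _ in range(100000):
    x = [random.gauss(0.0, s) for s in sx]
    y = [random.gauss(0.0, s) for s in sy]
    values.append(weighted_stat(x, y, px, py))

print(mu, sum(values) / len(values))  # the two numbers should be close
```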

Fig. 3 Histograms of Monte Carlo simulations of the values of \(h(X,Y)\) for the four example variants (a)–(d), described in detail in the text (\(\omega \) is the frequency)