Introduction

One of the most important and at the same time most difficult tasks undertaken in the complex process of constructing spatial information systems is the creation of a digital terrain model (DTM). The DTM is the basic information layer used by systems describing phenomena at high-definition levels, and it provides them with the basis of spatial organisation (Arctur and Zeiler 2004). Contemporary DTM users set high requirements, laying stress on data quality (accuracy, reliability, currency), on the dynamics of data processing and visualisation, and on the possibilities of analysis in real time (e.g. Brasington and Richards 1998; Borkowski 2012; Maleika 2013).

In order to construct a DTM of the seabed, measurement information has to be gathered first. Modern measurement systems with devices that make it possible to record observations in a continuous and fully automatic way (e.g. multibeam echosounders) enable acquisition of a huge amount of information in a relatively short time. Measurement systems register the location and depth (spatial coordinates) of many millions of points during one measurement campaign. The processing of such amounts of data, which in addition are mostly irregularly scattered in space, requires the application of specially prepared methods and properly selected processing algorithms. The DTM is usually made on the basis of a GRID structure (a regular net of squares). There are numerous methods of determining a GRID from measurement data, the ones most frequently applied being kriging, minimum curvature, nearest neighbour, natural neighbour, modified Shepard’s method, radial basis function, polynomial regression, inverse distance to a power, triangulation with linear interpolation, moving average, and methods based on artificial intelligence (e.g. Hamilton 1980; Stateczny 2000; Lubczonek and Stateczny 2003; Yang et al. 2004; Gosciewski 2014). These methods use a variety of algorithms to establish the values of interpolated parameters at node points. The selection of an interpolation method in the case of unevenly distributed measurement data should be determined by a number of features characterising such datasets: the degree of homogeneity of data dispersion, the number of points per unit area, the population variance (degree of data changeability), and the type of surface reflected by the data (Maleika et al. 2012a).

Within this context, moving average is one of the most underestimated interpolation methods used for grid data creation, which poses a challenge to researchers in this domain. It does give less precise results than other interpolation methods. Nevertheless, this comparison accounts only for the error rate.

As shown by the earlier studies cited above, the moving average method is one of the fastest known for grid interpolation (up to 10–20 times faster than the other commonly used methods). When both time cost and error rate are considered, it is therefore a very good alternative to slower methods. In this study, an improved variant of the moving average method is proposed, based on a novel approach to searching for the nodes that are taken into account while computing a new node value.

Basics of moving average method

The moving average method is an interpolation method that assigns a weighted average of the surrounding points to the output point. The weights are computed by a specific function, usually a function of distance: the closer a point is to the output point, the more influence it has on the output value. The function is implemented so as to ignore more distant points, which shortens the computation time. The algorithm can be described by two main steps.

  1. Step 1.

    For all output points, distances to all surrounding points are calculated to determine their weights. There are two main distance metrics that are used in this case:

    $$ \mathrm{inverse}\ \mathrm{distance}:\mathrm{weight}=\frac{1}{d^n}-1 $$
    (1)
    $$ \mathrm{linear}\ \mathrm{decrease}:\mathrm{weight}=1-{d}^n $$
    (2)

    where d is the relative distance between a data point and the output point (calculated as \( \frac{D}{D_0} \), where D is the Euclidean distance and \( D_0 \) is the limiting distance, i.e. the radius of the circle delimiting the search area), and n is the weight exponent.

  2. Step 2.

    For all output points, values are calculated as the sum of the products of weights and points values divided by the sum of weights, also called weighted average:

    $$ \mathrm{output}\ \mathrm{value}=\frac{{\displaystyle \sum \left({w}_i\cdot {\mathrm{val}}_i\right)}}{{\displaystyle \sum {w}_i}} $$
    (3)

    where \( w_i \) is the weight of point i, and \( \mathrm{val}_i \) is the value of point i.

    In most GIS software (e.g. Surfer 8.0), the moving average method is implemented so as to assign values to nodes by averaging the data inside a fixed search area (defined by the user). There is also an option to define the minimum number of neighbours that must lie inside the search area for a node value to be calculated; otherwise, the node is set as blank.
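The two steps above can be illustrated with a minimal Python sketch of a fixed-radius moving average for a single node. The function names, the default exponent, the clamping of very small distances and the exclusion of points lying exactly on the search circle (their weight would be zero anyway) are assumptions made for illustration; they are not taken from Surfer or from the study.

```python
import math

def weight(d, n=2.0, kind="linear"):
    """Weight for a relative distance d in (0, 1); both forms fall to 0 at d = 1."""
    if kind == "inverse":
        return 1.0 / d ** n - 1.0   # Eq. (1)
    return 1.0 - d ** n             # Eq. (2)

def moving_average_node(node, points, radius, min_points=1, n=2.0, kind="linear"):
    """Interpolate one node from (x, y, z) points; return None for a blank node."""
    x0, y0 = node
    pairs = []  # (weight, value) of every point inside the search circle
    for x, y, z in points:
        dist = math.hypot(x - x0, y - y0)
        if dist >= radius:
            continue  # outside (or exactly on) the search circle: ignored
        # Assumption: clamp tiny distances so Eq. (1) never divides by zero.
        d = max(dist / radius, 1e-12)
        pairs.append((weight(d, n, kind), z))
    if len(pairs) < min_points:
        return None  # too few neighbours: blank node
    total = sum(w for w, _ in pairs)
    if total <= 0.0:
        return None
    # Eq. (3): weighted average of the neighbouring values
    return sum(w * z for w, z in pairs) / total

# Hypothetical soundings (x, y, depth) around a node at the origin
soundings = [(0.1, 0.0, 10.0), (0.0, 0.2, 12.0), (5.0, 5.0, 99.0)]
print(moving_average_node((0.0, 0.0), soundings, radius=0.4, n=1.0))
print(moving_average_node((0.0, 0.0), soundings, radius=0.4, min_points=3))
```

In this example the third sounding lies far outside the 0.4 m search circle and does not affect the result; requiring three neighbours therefore yields a blank node.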

Research

The first stage of research comprised searching for an optimal method of establishing the number of measurement points utilized in the moving average interpolation process. The influence of two factors was examined: the minimum number of points, and the search radius taken into consideration while calculating new node values.

Test surfaces

The research was done using three surfaces prepared from real bathymetric data collected by the Szczecin Maritime Office, Poland. The surfaces differ strongly in their morphology (see Fig. 1):

Fig. 1 Test surfaces: a anchorage site, b swinging area, c submerged car wrecks

  • Anchorage: almost flat area of seabed,

  • Swinging area: varied surface with many holes and slopes,

  • Wrecks: almost flat surface with five car wrecks.

For each of the surfaces listed above, a grid model was built. Their sizes are presented in Table 1.

Table 1 Sizes of tested surfaces

Testing procedure

The purpose of this experiment was primarily to apply the moving average method, as implemented in Surfer 8.0, to create DTMs of all three surfaces. The effectiveness of the method was tested in several directions. First, the optimal radius of the search area was determined, on the assumption that a circle provides a satisfactory delimitation of the search area. The radius values examined ranged from 0.1 to 1 m in 0.1 m steps, with the minimum number of surrounding points inside the search area set to its default value of 4. Next, with the optimal radius fixed, the influence of the minimum number of surrounding points was tested.

Furthermore, the impact of smoothing filters on the resulting grid was examined. The main purpose of smoothing filters is to deal with data noise, here the random noise that results from random measurement errors (in the present case, about 5 cm at 10 m depth). Filtering is performed by applying a mask to the grid data. Surfer 8.0 provides several smoothing methods; the present study focuses on three of them:

  • Gaussian with mask 3×3 (mask 1),

  • 5-Node + averaging 3×3 (mask 2),

  • Median filter 3×3.
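A minimal Python sketch of how such 3×3 filters could be applied to a grid is given below. The source names the filters but does not list their coefficients, so the Gaussian and 5-node masks used here are common textbook choices and should be read as assumptions, not as the values used by Surfer 8.0. Leaving blank nodes (None) untouched and renormalising the mask at grid edges are likewise illustrative choices.

```python
from statistics import median

# Assumed masks (not taken from the source or from Surfer documentation)
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
FIVE_NODE_3X3 = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]

def window(grid, r, c):
    """Yield (dr, dc, value) for the non-blank cells of the 3x3 window at (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                if grid[rr][cc] is not None:
                    yield dr, dc, grid[rr][cc]

def mask_filter(grid, mask):
    """Weighted 3x3 average; blank nodes stay blank, weights are renormalised."""
    out = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] is None:
                continue
            num = den = 0.0
            for dr, dc, v in window(grid, r, c):
                w = mask[dr + 1][dc + 1]
                num += w * v
                den += w
            out[r][c] = num / den  # den > 0: the centre weight is positive
    return out

def median_filter(grid):
    """3x3 median of the non-blank cells; blank nodes stay blank."""
    out = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] is not None:
                out[r][c] = median(v for _, _, v in window(grid, r, c))
    return out

# Hypothetical flat 10 m seabed with a single noisy node in the centre
depths = [[10.0, 10.0, 10.0], [10.0, 13.0, 10.0], [10.0, 10.0, 10.0]]
print(mask_filter(depths, GAUSS_3X3)[1][1])
print(median_filter(depths)[1][1])
```

The example shows the qualitative difference between the two families: a weighted mask spreads the spike over its neighbours, whereas the median removes an isolated outlier entirely.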

All experimental results were compared with the model (reference) grids, which were created from data gathered under high-precision acquisition criteria.
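The comparison with a reference grid can be sketched as follows. The text does not define how the 95% confidence level of error is computed, so this sketch assumes it is the value below which 95% of the absolute depth differences fall (a nearest-rank percentile); that definition and the function name are assumptions. The percentage of blanked nodes is simply the share of nodes left without a value.

```python
import math

def compare_grids(test, reference, level=0.95):
    """Return (error at the given level, percentage of blanked nodes).

    Grids are lists of rows; a blank node is represented by None.
    """
    diffs = []
    blank = total = 0
    for test_row, ref_row in zip(test, reference):
        for t, r in zip(test_row, ref_row):
            total += 1
            if t is None:
                blank += 1
            elif r is not None:
                diffs.append(abs(t - r))
    diffs.sort()
    if diffs:
        # Nearest-rank percentile; the small offset guards against float rounding
        k = max(1, math.ceil(level * len(diffs) - 1e-9))
        error = diffs[k - 1]
    else:
        error = None
    return error, 100.0 * blank / total

# Hypothetical example: 20 nodes with errors of 0 to 19 cm and one blank node
reference = [[10.0] * 21]
test = [[10.0 + 0.01 * i for i in range(20)] + [None]]
error_95, blank_pct = compare_grids(test, reference)
print(error_95, blank_pct)
```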

Test results

The experiments were performed with several combinations of parameters, such as:

  • Radius (m): 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1;

  • Points required: 1, 2, 3, 4, 5;

  • Smoothing method: no smoothing, Gaussian 3×3, 5-node + averaging 3×3, median filter.
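For orientation, the parameter values listed above can be enumerated in a few lines of Python. The labels are hypothetical, and the study explored the parameters in stages and did not necessarily run every combination, so the count below is only the size of the full grid.

```python
from itertools import product

radii_m = [round(0.1 * k, 1) for k in range(1, 11)]   # 0.1 ... 1.0 m
points_required = [1, 2, 3, 4, 5]
smoothing = ["none", "gaussian_3x3", "5node_avg_3x3", "median_3x3"]

# Every (radius, points required, smoothing) triple of the full grid
variants = list(product(radii_m, points_required, smoothing))
print(len(variants))
```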

In the first part of the research, the influence of the search radius on the resulting grid was analysed. All tests were performed with the number of points required inside the search area set to 4. The results for the three test surfaces are presented in Figs. 2, 3 and 4.

Fig. 2 Results of anchorage surface with different search radius values

Fig. 3 Results of swinging area surface with different search radius values

Fig. 4 Results of wrecks surface with different search radius values

As can be seen in Fig. 2 (surface anchorage), the 95% confidence level of error rate decreases from 0.045 m to 0.015 m. The percentage of blanked nodes drops from 80% to 0%. Because the goal was a minimal value of 95% confidence level, the chosen value of the search radius is set at 1 m.

Figure 3 (surface swinging area) shows that overall the 95% confidence level of error rate decreases from 0.045 m to 0.023 m, and that it increases slightly for search radius values greater than 0.8 m. The percentage of blanked nodes decreases sharply at values between 0.2 and 0.3 m. Because the goal was a minimal value of 95% confidence level, the value of optimal search radius for this surface is set at 0.8 m.

Figure 4 (surface wrecks) reveals that the 95% confidence level of error rate decreases from 0.4 m to 0.05 m, and then increases to 0.19 m. The percentage of blanked nodes decreases from almost 48% at 0.1 m to stabilise at 0%. Consequently, the value of optimal search radius for this surface is set at 0.4 m.

It is difficult to determine a single optimal search radius for surfaces that vary in their features. Figure 5 presents the average value of the 95% confidence level for all surfaces with different search radius values (the minimum number of points required was set to 4). It can be seen that the best values of the search radius, i.e. those with the smallest error rates, are 0.4 to 0.5 m.

Fig. 5 Results of all three surfaces with different search radius values

In the second part of this research, the influence of the minimum number of points required to compute the node value was analysed. The best-performing search radius (based on all surfaces), 0.4 m, was used for further testing. As can be seen in Fig. 6, the percentage of blanked nodes depends only slightly on the number of points required. Moreover, three points required inside the search area proved to be the optimal value for all tested surfaces (it should be noted that, at this stage of research, the accuracy of the created model was not examined, only the number of undesirable blank nodes).

Fig. 6 Results of all three surfaces with different numbers of points required inside the search area

The third and last part of the research deals with the use of smoothing filters. The tests were performed with the best values of the search radius and of the number of points required inside the search area. The tests showed that, of the three smoothing filters considered, the Gaussian and median filters performed best for the tested surfaces.

Table 2 presents the most effective combinations of parameters, ordered by the average of the 95% confidence levels for all surfaces analysed. The error rate for the wrecks surface has the strongest influence on the ordering of variants, because of its relatively large value. The moving average variant with the optimal number of points required, the optimal search radius and Gaussian smoothing ranks as the best moving average variant. It can be seen that this variant is not optimal for every surface but that it gives the best result overall. There is a substantial difference between the surfaces: the 95% confidence level value ranges from 0.016 to 0.108 m.

Table 2 Best variants of moving average interpolation for all tested surfaces ordered in terms of the average error level

Table 3 shows the best variants of the moving average method for each of the tested surfaces. For all of these, the number of points required inside the search area is set to 1. In all cases, the best grids were obtained with a smoothing filter applied (Gaussian or median). For all three surfaces (swinging area, wrecks and anchorage), the best variant of moving average produced no blanked nodes, so that the output grid was complete. These results were compared to those obtained using kriging and inverse distance interpolation methods (for the same surfaces).

Table 3 Best variants of moving average for each of the tested surfaces

Table 3 also presents the best parameter combination for each method. There are two surfaces for which the 95% confidence level value is relatively small: the wrecks and the anchorage. This can be explained by their smooth (flat) topography. Thus, the more complex the surface, the larger the error of the moving average method.

In order to prevent distant, and therefore less significant, points from influencing the output value of the calculated node, only radius values up to 1 m were employed.

The computational efficiency of the individual interpolation methods (moving average, kriging, and inverse distance) is not discussed in detail in this paper, but it is worth noting that the moving average method is approx. 10–30 times faster than the two other methods. More details regarding the speed and accuracy of interpolation of such data can be found in Maleika et al. (2012b).

Based on broader research carried out by the author (not all of which is presented in this paper), the following can be noted:

  • increasing the search radius beyond 1 m results in larger errors;

  • too many measurement points considered during interpolation also results in larger errors;

  • the best results are obtained for approx. 2–5 measurement points lying in closest proximity.

Based on the above, the working hypothesis is that the moving average method can be further optimized by eliminating the fixed size of the search radius and focusing instead on several closest measurement points (the operator specifies the number of measurement points required and their maximum distance from the node being calculated).

Improved moving average variant: the growing radius

The algorithm of the improved moving average variant incorporates:

  • the minimum number of points used for calculations, P;

  • the start radius, R;

  • the maximum radius, \( R_{\mathrm{MAX}} \).

First, the points inside the search area are ordered according to their squared distance from the node being calculated (see Fig. 7a). Then, the closest P points inside the search area determined by radius R are chosen (Fig. 7b). If the number of points inside the search area is greater than or equal to P, moving average (MA) interpolation is performed using the P nearest points. Otherwise, R is increased step by step until either the search area contains at least P points, in which case MA interpolation is performed using these points (Fig. 7c), or R exceeds \( R_{\mathrm{MAX}} \), in which case the node is set as blank. The pseudocode of the improved moving average variant is provided in Fig. 8.

Fig. 7 Illustration of improved moving average performance: a search radius is set as minimum; b amount of points in search radius is too low; c search radius is enlarged to contain a certain amount of points

Fig. 8 Pseudocode of improved moving average variant
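The growing-radius procedure described in this section can be sketched in Python for a single node as follows. This is an illustrative reading of the textual description, not a transcription of the author's pseudocode: the function name, the step size parameter, the use of the linear-decrease weight of Eq. (2) and the normalisation of distances by the current radius are all assumptions.

```python
import math

def growing_radius_node(node, points, p, r_start, r_max, r_step=0.1, n=2.0):
    """Return (value, radius used), or (None, None) for a blank node."""
    x0, y0 = node
    # Order candidate points by squared distance from the node
    candidates = sorted(((x - x0) ** 2 + (y - y0) ** 2, z) for x, y, z in points)
    k = 0
    while True:
        r = r_start + k * r_step  # avoids accumulating float error
        if r > r_max + 1e-9:
            return None, None     # R exceeded the maximum radius: blank node
        inside = [(d2, z) for d2, z in candidates if d2 < r * r]
        if len(inside) >= p:
            nearest = inside[:p]  # the P closest points
            # Assumption: linear-decrease weights relative to the current radius
            weights = [1.0 - (math.sqrt(d2) / r) ** n for d2, _ in nearest]
            total = sum(weights)
            value = sum(w * z for w, (_, z) in zip(weights, nearest)) / total
            return value, r
        k += 1

# Hypothetical soundings (x, y, depth) around a node at the origin
soundings = [(0.05, 0.0, 10.0), (0.0, 0.25, 12.0), (0.35, 0.0, 14.0), (3.0, 3.0, 99.0)]
print(growing_radius_node((0.0, 0.0), soundings, p=2, r_start=0.1, r_max=1.0, n=1.0))
print(growing_radius_node((0.0, 0.0), soundings, p=4, r_start=0.1, r_max=1.0))
```

In the first call the radius grows from 0.1 m to 0.3 m before two points are found; in the second call fewer than four points lie within 1 m, so the node is left blank.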

Testing the new method

The usefulness of the proposed method was examined by calculating the accuracy of test models created using the modified moving average method and comparing the results with those obtained before. The series of interpolation tests was performed with the following parameters: P = 1 to 5 points, \( R_{\mathrm{MIN}} \) = 0.1 m, \( R_{\mathrm{MAX}} \) = 1 m and \( R_{\mathrm{STEP}} \) = 0.1 m, with the smoothing method set to Gaussian. The results obtained are presented in Table 4.

Table 4 Comparison of the modified moving average (MA) and the regular MA method (95% error confidence level)

Examining the data in Table 4 reveals that:

  • the best results (smallest error) were obtained when the closest 3 or 4 points were taken for interpolation;

  • the flatter the surface, the smaller the minimum number of points (for the anchorage, 1–3 and, for the wrecks, 3);

  • the dimensions of the frame for which the best results were obtained were approx. 0.3–0.6 m;

  • the depth accuracy in the created models (using the improved moving average variant) is higher by approx. 10%.

It is worth noting that the proposed interpolation method slightly speeds up the interpolation process in comparison with the traditional moving average (fixed search radius, but big enough to encompass a sufficient number of points). It can be stated that the size of the frame adapts to the density of measurement points (which can vary not only between data from different surveys but also within the process of creating a model of one and the same surface). In the presented method it is the operator who decides how many points are considered while calculating a new node. The operator also defines the condition for creating a blank node (\( R_{\mathrm{MAX}} \)).

Outlook

The paper presents research on moving average variants for application in digital terrain model creation. Because the method performed well on variously shaped surfaces, the experiments convincingly demonstrate the high potential of the employed variants. Indeed, the moving average method performed very satisfactorily for all tested surfaces, with a 95% confidence level of error of less than 0.07 m. This meets the IHO standard, which requires the error of interpolated grids to be less than about 20 cm (IHO 1998).

The proposed improvement extends the usefulness of the classic moving average approach. As demonstrated by the experiments, its application increases the accuracy of the resulting grids, which fit the real seabed surface better.

The proposed improved moving average module, the “growing radius”, can be integrated into GIS software for modelling surfaces based on huge amounts of measurement data from, for example, multibeam echosounding and, for that matter, other technologies such as aerial and LIDAR as well as structure-from-motion photogrammetry (e.g. Fraile-Jurado and Ojeda-Zújar 2013; Ružić et al. 2014).