Introduction

The main goal of this work is to create an automatic method for locating weak zones within the structure of flood embankments based on GPR measurements.

The use of the GPR method in research on flood embankments has been extensively described in the scientific literature (Gołębiowski et al. 2018; Gołębiowski and Małysa 2018; Anchuela and Pueyo 2018; Łój et al. 2018; Perri et al. 2014; Gołębiowski et al. 2012; Słowik 2011; Xu et al. 2010; Di Prinzio 2010; Mori 2009; Marcak et al. 2005). However, many problems remain unresolved, and improving on the existing approaches to address them is the focus of this paper.

In essence, advanced statistical analysis in the form of an artificial neural network (ANN) was applied to the measured data and its attributes in order to improve their interpretation. A dedicated ANN-based method was developed for the detection and location of anomalies. The theory of this method is presented in Tadeusiewicz (2015) and Szymczyk et al. (2014, 2015a, b).

One of the main problems that arises when flooding threatens after heavy rainfall is whether the embankments are strong enough to withstand the flood. When the flood wave arrives, the only hope is that the embankments will hold and that the water will not inundate the cultivated fields and human settlements that lie behind them.

Most of the existing embankments in Poland were built many years ago. They have proven their usefulness during past floods, but this does not guarantee that they will also be effective during future floods. External inspection of an embankment does not reveal the greatest threats, which are potential zones of weakness inside it. These zones may have different origins: for example, animal burrows hollowed out inside the embankment and weakening its structure, erosional effects, and sometimes errors made during construction.

It is worth emphasizing that loosened zones often occur very low in the structure of the embankment, sometimes even close to its foot. When flood water begins to seep through them, the typical flood defence method of laying sandbags on the embankment crown is ineffective, because the water passes through the lower part of the embankment and destroys its internal structure at the same time.

The most important factor in analysing whether anomalies in the embankment structure can be detected using electrical and electromagnetic techniques is the assessment of how the presence of weak zones alters the electromagnetic properties of the ground. Of utmost importance are the relative dielectric permittivity εr [–], the electrical conductivity σ [S/m] and the relative magnetic permeability µr [–], whose influence is usually negligible and which is taken as 1.

The materials most often used for the construction of levees are clayey sands and sandy silts. Due to the high attenuation of the GPR signal, which results from the high conductivity of clayey material, the results of GPR measurements taken on flood embankments are usually difficult to interpret. This article aims to present a methodology of GPR signal processing that allows anomalous zones within flood embankments to be separated.

To detect anomalies in the structure of the embankments, it was decided to use ground penetrating radar (GPR).

Location of the study area

The GPR measurements were taken along several profiles on the Rudawa River embankments at Becka Street in Wola Justowska, a suburb of Krakow (Fig. 1). It is known today that the Rudawa has changed its course. The current river bed is straight as a result of regulation and the construction of the embankments (Łanczont et al. 2015). This contrasts with the situation in the past, particularly before the nineteenth century, when the river meandered and caused flooding. The higher ground within this zone served as a burial ground for the dead after outbreaks of cholera.

Fig. 1.

a View of the plain of the Rudawa River embankments where measurements were made (profile 1). b Artistic impression of the Rudawa River landscape after Stanisław Wyspiański

During the construction of the embankments, the Rudawa’s old river beds were dug. Until now, the former course of the Rudawa River bed could be determined from descriptions or from old photographs (Fig. 1b).

Method of study

The ProEx GPR system manufactured by MALA Geoscience (now ABEM/MALA) was used with a 250-MHz shielded antenna. Its mean resolution is approximately 0.1 m and its maximum penetration depth is approximately 10 m, which makes it possible to visualize the loose zones in the Rudawa River embankment. An example radargram for profile no. 1 is presented in Fig. 2.

Fig. 2.

Radargram of one of the profiles (no. 1) of the flood embankments in the vicinity of the Rudawa River, with the distribution of anomalies related to the loosened zones

All radargrams were processed using ReflexW program (Sandmeier 2012) with the following procedures (Annan 1999): phase correction, time zero correction, amplitude declipping, dewowing, DC shift, gain, background removal, Butterworth filter, deconvolution, smoothing.

For the time–depth conversion, a mean velocity of v = 0.0754 m/ns was assumed on the basis of WARR profiling carried out in the studied area (Fig. 3).
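The conversion can be illustrated with a minimal Python sketch. It assumes that the recorded time is two-way travel time, that the medium is low-loss and that µr = 1; the function names are hypothetical and are not part of ReflexW.

```python
C_M_PER_NS = 0.299792458  # speed of light in vacuum [m/ns]


def twt_to_depth(twt_ns, velocity_m_per_ns=0.0754):
    """Depth [m] from two-way travel time [ns]; the wave travels down and back."""
    return velocity_m_per_ns * twt_ns / 2.0


def relative_permittivity(velocity_m_per_ns):
    """Low-loss approximation with mu_r = 1: v = c / sqrt(eps_r)."""
    return (C_M_PER_NS / velocity_m_per_ns) ** 2


if __name__ == "__main__":
    for t in (15.0, 40.0):
        print(f"{t:5.1f} ns -> {twt_to_depth(t):.2f} m")
    print(f"eps_r ~ {relative_permittivity(0.0754):.1f}")
```

Under these assumptions, the assumed velocity corresponds to a bulk relative permittivity of roughly 16, and a two-way time of 40 ns corresponds to a depth of about 1.5 m.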

Fig. 3.

WARR profile in the area of study

The zones of weakness in the embankment, regardless of whether they are dry or filled with water, cause an increase in reflectivity. This means that larger amplitudes of the recorded signal appear on the radargram. Randomly distributed high-amplitude anomalies probably indicate weak zones. These zones are prone to seepage or may become pathways for river water in the near future.

The standard method of interpreting GPR data, which relies on visual inspection, may not be effective for recognizing weak zones that are prone to seepage or may act as pathways for water. Hence, computing attributes of the GPR signal may enhance the identification of the inherent properties of the embankment materials. Moreover, the computed attributes may serve as valuable input to an artificial neural network.

Although the information displayed on the processed radargram may be visible and may support interpretation, different interpreters read it subjectively. To overcome this subjectivity, advanced processing in the form of signal analysis (attributes) was carried out to provide information that may not otherwise be discernible and that is less subjective. The output of this analysis may serve as input to neural networks for better interpretation.

Examination of embankments using GPR data requires many techniques, and many options are currently available. Visual inspection can, however, be supported or replaced by more precise methods of artificial intelligence. This is necessary because the anomalies recorded on radargrams are not always legible enough to be recognized by eye alone.

The signal analysis was carried out using OpendTect software (DGB Beheer BV 2019). OpendTect is a software system for post-processing, visualizing and interpreting waveform data, and for the fast-track development of innovative interpretation tools.

In a classification or pattern recognition task, an essential step is to analyse the source signal and remove its natural redundancy, which allows better identification through statistical analysis. In this way, a semantic, adequate and objective interpretation is achieved. In this work, this process is carried out with an artificial neural network.

An ANN requires input data. In this case, we used the attributes of the GPR signals calculated from the field data.

An attribute is any quantitative transformation of the data acquired in a GPR survey. The application of attributes allowed for an improved interpretation of the GPR data.

The following signal attributes were calculated using OpendTect: Energy, instantaneous frequency, similarity, texture attributes that return statistical properties of a grey-level co-occurrence matrix (GLCM), curvature gradient, Grubbs filter threshold, Convolve Prewitt, dominant frequency, steepness and absorption quality factor.

The Energy attribute calculates the sum of the squared sample values in the specified time-gate divided by the number of samples in the gate. The Energy is a measure of reflectivity in the specified time-gate: the higher the Energy, the higher the amplitude. This attribute enhances, among other things, lateral variations within events.

The Energy of a signal x(t) is given by:

$$E = \int_{-\infty}^{\infty} \left| x\left( t \right) \right|^{2} \,{\text{d}}t.$$
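As an illustration only, the time-gate form of the Energy attribute described above can be sketched in discrete form. The function name, the gate length and the sample trace are hypothetical; the gate is simply truncated at the ends of the trace, which may differ from how OpendTect handles the edges.

```python
def energy_attribute(trace, half_window):
    """Mean of the squared samples in a gate of 2*half_window + 1 samples
    centred on each sample (gate truncated at the trace ends)."""
    n = len(trace)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        gate = trace[lo:hi]
        out.append(sum(s * s for s in gate) / len(gate))
    return out


# Hypothetical trace with one strong reflection in the middle.
trace = [0.0, 0.1, -0.1, 2.0, -2.0, 0.1, 0.0]
energy = energy_attribute(trace, half_window=1)
print([round(e, 3) for e in energy])
```

The largest Energy values coincide with the high-amplitude reflection, independently of its polarity.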

Energy attribute is used in GPR method for a wide range of purposes; examples of usage can be found in Gołębiowski et al. (2018) and Zhao et al. (2013).

In our research, the Energy attribute highlights the maximum amplitudes of the reflected wave signal, which can be correlated with the loosened zones (yellow zones in Fig. 4).

Fig. 4.

The Energy attribute highlights the maximum amplitudes of the reflected wave signal that can be correlated with the loosened zones (yellow areas)

An instantaneous attribute returns a value at a single sample location; it is computed sample by sample and represents instantaneous variations of various parameters. Instantaneous values of attributes such as the trace envelope, its derivatives, frequency and phase may be determined from complex traces (Taner et al. 1979; Taner 2001).
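The complex-trace idea can be sketched as follows. This is a minimal illustration, not the OpendTect implementation: the analytic signal is built with a naive discrete Fourier transform using only the Python standard library, and the function names, sampling interval and test signal are assumptions. With the sampling interval in ns, the frequency is returned in GHz.

```python
import cmath
import math


def analytic_signal(trace):
    """Complex trace via a naive O(n^2) DFT: keep DC (and Nyquist),
    double the positive frequencies, zero the negative ones."""
    n = len(trace)
    spectrum = [
        sum(trace[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue  # Nyquist bin stays unchanged
        spectrum[k] = 2 * spectrum[k] if k < (n + 1) // 2 else 0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(trace, dt):
    """Phase increment between consecutive complex samples / (2*pi*dt)."""
    z = analytic_signal(trace)
    return [
        cmath.phase(z[i + 1] * z[i].conjugate()) / (2 * math.pi * dt)
        for i in range(len(z) - 1)
    ]


# Hypothetical test signal: a 250-MHz cosine sampled every 0.5 ns.
dt_ns = 0.5
trace = [math.cos(2 * math.pi * 0.25 * i * dt_ns) for i in range(64)]
freqs_ghz = instantaneous_frequency(trace, dt_ns)
print(round(freqs_ghz[10], 4))
```

Taking the phase of the product of one sample with the conjugate of the previous one avoids explicit phase unwrapping.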

The instantaneous frequency attribute responds to both wave propagation effects and depositional characteristics, and hence, it is a physical attribute which can be used as an effective discriminator.

Its uses include fracture zone indication, since fractures may appear as lower-frequency zones, and bed thickness indication. Higher frequencies indicate sharp interfaces, such as those exhibited by thinly laminated shales, whereas lower frequencies are indicative of more massive bedding geometries, e.g., sand-prone lithologies (Nemytova et al. 2012).

Instantaneous frequency (Fig. 5) depends on the geometrical layout of the reflections and on interference. It does not depend on the signal Energy. In our case, it enhances the interface between two GPR facies.

Fig. 5.

Instantaneous frequency computed for profile 1. It enhances the interface between two GPR facies

Examples of application of the aforementioned attribute for GPR prospecting can be found in Bradford and Wu (2007).

Similarity is a form of coherency that expresses how much two or more trace segments look alike. A similarity of 1 means the trace segments are completely identical in waveform and amplitude.

A similarity of 0 means they are completely dissimilar. It is used to visualize abrupt pinch-outs, erosional incisions and laterally variable lithofacies.
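A minimal sketch of such a measure is given below. It assumes the normalized Euclidean-distance form, in which two equal-length segments are treated as vectors; this is one common definition and is given here as an assumption rather than as the exact formula used in the software. The function name and the example segments are hypothetical.

```python
import math


def similarity(seg_a, seg_b):
    """1 - |a - b| / (|a| + |b|) for two equal-length trace segments."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(seg_a, seg_b)))
    norm = math.sqrt(sum(a * a for a in seg_a)) + math.sqrt(sum(b * b for b in seg_b))
    if norm == 0:
        return 1.0  # two all-zero segments are treated as identical
    return 1.0 - dist / norm


wavelet = [1.0, -2.0, 1.0]
print(similarity(wavelet, wavelet))                     # identical segments
print(similarity(wavelet, [-s for s in wavelet]))       # reversed polarity
print(similarity(wavelet, [0.5 * s for s in wavelet]))  # same shape, half amplitude
```

With this definition, a segment with the same waveform but a different amplitude scores below 1, which is consistent with the requirement that segments be identical in both waveform and amplitude.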

The similarity attribute can be used to emphasize the variability of neighboring data (Boniger and Tronicke 2012) and thus makes it possible to sharpen the edges of naturally and artificially generated degradation zones within the embankment structure. The theory and applications of similarity for seismic data are described by Chopra (2007) and Chopra and Alexeev (2005).

The result of the similarity attribute (Fig. 6) shows continuous coherent reflections, depicted by the white patches, which could be correlated with highly saturated soil or with stratification.

Fig. 6.

Similarity computed for profile 1. Continuous coherent reflections depicted by the white patches, which could be correlated with highly saturated soil or the appearance of stratification

Texture attributes that return statistical properties of a grey-level co-occurrence matrix (GLCM) (Chopra and Alexeev 2005; Hall-Beyer 2012) characterize the texture of an image by calculating how often pairs of pixels with specific values and in a specified spatial relationship occur in the image, creating a GLCM, and then extracting statistical measures from this matrix. GLCM texture considers the relation between two pixels at a time, called the reference and the neighbor pixel. The attribute response is calculated in two steps. First, the GLCM is computed for an area (volume) around the evaluation point. Second, a statistical property of the GLCM is returned.
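The two steps can be sketched for a small image whose amplitudes are assumed to have been quantized beforehand to integer grey levels. The function names and the 4 × 4 test image are illustrative; the symmetric, normalized form of the matrix and the horizontal neighbor offset are choices made for this sketch.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized GLCM for the neighbor offset (dx, dy).
    `image` holds integer grey levels in range(levels)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # reference -> neighbor
                counts[j][i] += 1  # neighbor -> reference (symmetry)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_mean(p):
    """GLCM mean: grey level of the reference pixel weighted by pair probability."""
    return sum(i * p[i][j] for i in range(len(p)) for j in range(len(p)))


def glcm_contrast(p):
    """GLCM contrast: squared grey-level difference weighted by pair probability."""
    return sum((i - j) ** 2 * p[i][j] for i in range(len(p)) for j in range(len(p)))


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
print(round(glcm_mean(p), 4), round(glcm_contrast(p), 4))
```

In an attribute computation, such a matrix would be evaluated in a moving window around each sample, and one statistic (for example the mean) would be stored at the window centre.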

In the literature, a few examples can be found of using GLCM for GPR signal enhancement (McClymont et al. 2008; Catakli et al. 2011; Zhao et al. 2016).

The GLCM attribute is used in facies analysis, which can be done using both supervised and unsupervised approaches.

The GLCM texture analysis (Fig. 7) confirms that the embankment was constructed from different materials.

Fig. 7.

GLCM mean computed for profile 1

An integrated curvature and curvature gradient analysis suggests that curvature might help define areas of enhanced potential to form tensile fractures, whereas curvature gradient might help define zones of enhanced potential to develop shear fractures (Gao 2013). It can be potentially applied to differentiate fracture mode, to predict fracture intensity and orientation, to evaluate fracture volume and connectivity and to model fracture networks. The most commonly applied of these are measurements of curvature.

The curvature gradient also depicts the geometry of reflectors in the subsurface.

Curvature is also defined as the inverse of the radius of the circle that is not only tangent to the surface but that also fits the surface (Rich 2008; Rich and Marfurt 2013).
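For a single reflector described by its depth z(x) along the profile, this definition can be checked numerically. The sketch below uses central differences and the standard expression k = z'' / (1 + z'²)^(3/2); the function names and the circular test reflector are hypothetical, and the lateral derivative of curvature is used here only as a simple stand-in for a curvature gradient, not as the definition implemented in OpendTect.

```python
import math


def curvature(z, dx):
    """Curvature of a sampled curve z(x) at interior points (central differences)."""
    k = []
    for i in range(1, len(z) - 1):
        d1 = (z[i + 1] - z[i - 1]) / (2 * dx)
        d2 = (z[i + 1] - 2 * z[i] + z[i - 1]) / dx ** 2
        k.append(d2 / (1 + d1 * d1) ** 1.5)
    return k


def lateral_gradient(values, dx):
    """Central-difference derivative along the profile at interior points."""
    return [(values[i + 1] - values[i - 1]) / (2 * dx) for i in range(1, len(values) - 1)]


# Hypothetical reflector: lower arc of a circle with radius R = 5 m.
R, dx = 5.0, 0.01
xs = [i * dx for i in range(-100, 101)]
z = [-math.sqrt(R * R - x * x) for x in xs]
k = curvature(z, dx)
print(round(k[len(k) // 2], 4), "expected 1/R =", 1 / R)
```

For the circular arc the curvature is constant and equal to 1/R, so its lateral gradient is close to zero; a change in reflector shape along the profile would show up as a non-zero gradient.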

The curvature gradient has shown anomalies suggestive of the oxbows that existed in this area many years ago (marked by dashed yellow circles in Fig. 8). These oxbows are thought to have been backfilled with materials of different provenance. These findings can contribute to the reconstruction of the paleo-environment in which Krakow developed.

Fig. 8.

Curvature gradient computed for profile 1

In this attribute, traces of the oxbows of the Rudawa River are better delineated than in the field data. Zones have been revealed that may indicate the earlier course of the Rudawa River bed (Fig. 8).

Grubbs filter threshold is an attribute that removes outliers from the data. It is based on the Grubbs test, a statistical test used to detect outliers in a univariate data set that is assumed to come from a normally distributed population.
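The underlying test can be sketched as follows. The statistic G is the largest absolute deviation from the sample mean divided by the sample standard deviation; the critical value is passed in by the caller, since it depends on the number of samples and on the significance level. The function names and the data are hypothetical, and the critical values used in the example (2.290 for N = 10 and 2.215 for N = 9) are the commonly tabulated two-sided 5% values, which should be checked against a statistical table before any real use.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) where G = max|x_i - mean| / s and index locates that sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        return 0.0, 0  # constant data: no outlier can be declared
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / sd, idx


def grubbs_outlier(values, g_critical):
    """Index of the most extreme sample if it fails the test, otherwise None."""
    g, idx = grubbs_statistic(values)
    return idx if g > g_critical else None


# Hypothetical amplitudes with one spike.
amplitudes = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 1.03, 5.0]
print(grubbs_outlier(amplitudes, g_critical=2.290))      # spike is flagged
print(grubbs_outlier(amplitudes[:9], g_critical=2.215))  # nothing flagged
```

A filter built on this test would replace or remove the flagged sample and could repeat the test on the remaining data, with the critical value updated for the reduced sample size.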

The Grubbs filter threshold has helped to eliminate random noise that may have obscured the actual response. As in the case of the curvature gradient, the oxbows are enhanced by the Grubbs filter and are marked in Fig. 9.

Fig. 9.

Grubbs filter threshold for profile 1. Anomalies probably connected with oxbows are marked by dashed yellow lines

Convolve Prewitt is an attribute that returns a filtered response: the input data are convolved with a three-dimensional kernel specified by the filter type and associated parameters. The application of the convolutional Prewitt operator to GPR data is presented by Peng and Zhao (2013) and Boniger and Tronicke (2012). In this article, we used the Prewitt filter to emphasize the expected features of loosened zones (Fig. 10).
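For a single two-dimensional radargram, the operation reduces to filtering with 3 × 3 kernels. The sketch below is an illustration under that assumption (rows are time samples, columns are traces); the function names and the synthetic image are hypothetical, and the kernels are applied in cross-correlation form, which for the Prewitt operator differs from true convolution only in sign.

```python
import math

PREWITT_LATERAL = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]   # change from trace to trace
PREWITT_VERTICAL = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]  # change along the time axis


def filter3x3(image, kernel):
    """Apply a 3x3 kernel at interior pixels; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                kernel[i][j] * image[r - 1 + i][c - 1 + j]
                for i in range(3)
                for j in range(3)
            )
    return out


def gradient_magnitude(image):
    """Combine both Prewitt responses into an edge-strength image."""
    gx = filter3x3(image, PREWITT_LATERAL)
    gy = filter3x3(image, PREWITT_VERTICAL)
    return [[math.hypot(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(gx, gy)]


# Synthetic section: amplitude steps from 0 to 1 between traces 2 and 3.
section = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = gradient_magnitude(section)
print(edges[2])
```

The response is non-zero only next to the lateral amplitude step, which is how such a filter emphasizes the boundaries of anomalous zones.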

Fig. 10.

Convolve Prewitt for profile 1

Dominant frequency is usually estimated by counting the number of relative maxima within some interval. These estimates depend on the phase of the data and tend to be higher than the zero-crossing frequency (Barnes 1993).
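Both estimates can be sketched for a single trace segment. The function names, the sampling interval and the test signal are hypothetical, and the sketch does not reproduce the estimator used in OpendTect. With the sampling interval in ns, the frequencies are in GHz.

```python
import math


def count_maxima(x):
    """Number of relative maxima at interior samples."""
    return sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1])


def count_zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))


def maxima_frequency(x, dt):
    """One relative maximum per cycle: maxima / duration."""
    return count_maxima(x) / ((len(x) - 1) * dt)


def zero_crossing_frequency(x, dt):
    """Two zero crossings per cycle: crossings / (2 * duration)."""
    return count_zero_crossings(x) / (2 * (len(x) - 1) * dt)


# Hypothetical segment: 250-MHz sinusoid, 0.1-ns sampling, 32 ns long.
dt_ns = 0.1
segment = [math.sin(2 * math.pi * 0.25 * i * dt_ns + 0.3) for i in range(321)]
print(maxima_frequency(segment, dt_ns), zero_crossing_frequency(segment, dt_ns))
```

For a clean sinusoid the two estimates agree. Small ripples superimposed on the signal add relative maxima without necessarily adding zero crossings, which is why the maxima-based estimate tends to be the higher of the two.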

Dominant frequency (Fig. 11) shows zones of weaker or stronger interference and horizontal changes in lithology. Significant changes in the value of this attribute indicate a transition from low-conductivity material to high-conductivity zones.

Fig. 11.

Dominant frequency for profile 1. Significant changes in the value of this attribute indicate transition from low-conductivity material to high-conductivity zones

Neural networks

After characterizing the tested signal using the attributes described above, an attempt was made to support its interpretation, and possibly also its automatic classification, using one of the artificial intelligence tools that serve in computer-aided decision-making. In this work, a tried-and-tested tool was chosen: the artificial neural network.

A neural network can be trained with a teacher (supervised) or can learn by itself (unsupervised). Self-learning neural networks (Tadeusiewicz 2015; Mazurkiewicz et al. 2016) can be used for initial or fully automatic, fast, low-cost classification of anomalous zones. However, because the learning scheme is completely automatic, an appropriate selection of attributes is important. After analysing a wide range of attributes provided by the OpendTect software, we chose those presented earlier in this paper as the best indicators of weakened zones in the embankment structure and used them as input to the unsupervised neural network. Additionally, a set of randomly distributed points was picked within the areas of the radargram under consideration. The results of the unsupervised classification are shown in Fig. 15. In this example, the data were classified into four classes. The first class (blue) probably indicates high-frequency noise zones that were not eliminated by processing. The second class (green) is connected with the stable part of the embankment. The fourth class (pink) is probably connected with weak zones within the embankment structure. The main weak zone is located between 40 and 60 m of the profile. Figure 16 shows the certainty of the classification.

The role of neural networks trained by the “without a supervisor” method is to group the input data so that mutually similar data form groups (clusters), while data that are only slightly similar or completely unlike each other are separated.

Because this is done using the unsupervised learning method, the creator of the network has no influence on how these groups are formed.

A neural network reveals the structure of the analysed data but does not uncover any causal relationships.

After obtaining information about the clusters detected by the network, it is necessary to analyse selected examples of data from individual clusters and interpret them (for example, as GPR signals from weakened zones of the embankment).

The important thing is that the self-learning network (usually Kohonen’s network is used here) will detect and indicate groups of data in the multi-dimensional attribute space, which can then be correctly interpreted.
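The grouping principle can be illustrated with a deliberately simplified sketch: winner-take-all competitive learning, which corresponds to a Kohonen map without a neighborhood function. It is not the network used in this study. The attribute vectors (normalized Energy and similarity), the choice of initial nodes and all function names are hypothetical.

```python
def nearest_node(nodes, x):
    """Index of the node closest to sample x (squared Euclidean distance)."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(nodes[j], x)),
    )


def train_competitive(samples, init_nodes, epochs=20, lr0=0.5):
    """Move only the winning node towards each sample; learning rate decays linearly."""
    nodes = [list(w) for w in init_nodes]
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)
        for x in samples:
            w = nodes[nearest_node(nodes, x)]
            for d in range(len(w)):
                w[d] += lr * (x[d] - w[d])
    return nodes


# Hypothetical attribute vectors: (normalized Energy, similarity).
samples = [
    (0.10, 0.90), (0.15, 0.85), (0.12, 0.95),  # low Energy, coherent reflections
    (0.90, 0.20), (0.85, 0.25), (0.95, 0.15),  # high Energy, disturbed reflections
]
nodes = train_competitive(samples, init_nodes=[samples[0], samples[-1]])
labels = [nearest_node(nodes, x) for x in samples]
print(labels)
```

The network only separates the two groups; deciding which cluster corresponds to a weakened zone still requires the interpretation step described above.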

A neural network trained with a teacher (supervised) is the most commonly used type of neural network for automatic classification. In this area, feedforward networks of the MLP (multi-layer perceptron) type, which can be trained using the backpropagation method, are used most often. An example of their use in a similar classification task can be found in Haduch and Tadeusiewicz (2018).

This more advanced method requires defining one set of peaks (picked points) in the zones that we consider to threaten the stability and resistance of the embankments, and a second set of peaks outside these zones. These sets were then used as the so-called learning sets.

As input for the supervised neural network classification, we used the same attributes as for the unsupervised one. Additionally, we had to define two sets of peaks. The first one was picked in the zone of high values of the Energy attribute (40–60 m of the profile; 15–40 ns), which we considered to be a loosened zone. The second set of peaks was picked outside this zone. The classification gives two classes. The first one (pink areas in Fig. 12) shows the probable distribution of weakened zones within the embankment. Figures 13 and 14 show the degree of classification matching for the two classes.
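The supervised scheme can be sketched with a very small MLP trained by backpropagation. This is an illustration under stated assumptions, not the network used in this study: the architecture (two inputs, three hidden neurons, one output), the learning rate, the number of epochs, the attribute vectors and the labels (1 for points picked inside the suspected weak zone, 0 for points picked outside it) are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, sigmoid activations, squared-error loss."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1])
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        delta_out = (y - target) * y * (1 - y)  # error signal at the output
        delta_hid = [delta_out * self.w2[j] * h[j] * (1 - h[j]) for j in range(len(h))]
        for j in range(len(h)):
            self.w2[j] -= lr * delta_out * h[j]
        self.w2[-1] -= lr * delta_out
        for j, row in enumerate(self.w1):
            for i in range(len(x)):
                row[i] -= lr * delta_hid[j] * x[i]
            row[-1] -= lr * delta_hid[j]

    def loss(self, samples, targets):
        return sum((self.forward(x)[1] - t) ** 2 for x, t in zip(samples, targets)) / len(samples)


def train(net, samples, targets, epochs=500, lr=0.5):
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            net.train_step(x, t, lr)


# Hypothetical learning sets: (normalized Energy, similarity) and class labels.
SAMPLES = [(0.10, 0.90), (0.15, 0.85), (0.12, 0.95), (0.90, 0.20), (0.85, 0.25), (0.95, 0.15)]
LABELS = [0, 0, 0, 1, 1, 1]

net = TinyMLP(n_in=2, n_hidden=3)
loss_before = net.loss(SAMPLES, LABELS)
train(net, SAMPLES, LABELS)
print(loss_before, net.loss(SAMPLES, LABELS))
```

The quality of such a classifier depends directly on how well the learning sets represent the two classes, which is the limitation discussed for the supervised results in this section.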

Fig. 12.

Supervised classification results

Fig. 13.

Supervised classification—classification matching for weak-Energy zones

Training neural networks “with a teacher” (supervised learning) is very effective when a large and correctly defined training set is available, that is, a set of input data for which the correct decision is known and can be considered certain. This technique has been applied in other fields, such as financial consulting and medical diagnostics (Sasiada et al. 2017). However, for the problem of assessing the condition of flood embankments considered in this work, no such reference training data existed, because no information on the soil profile of the surveyed embankments was available in which the locations of existing damage and structural changes had been confirmed, for example, by drilling (there was no detailed information from boreholes or geotechnical investigation points). For this reason, neural networks trained by the supervised learning method did not give sufficiently unambiguous results (Figs. 12, 14).

Fig. 14.

Supervised classification—classification matching for high-Energy zones

It should be noted that if the location of a weakened zone is known with high probability, for example from drilling into it, and the neural network is taught that it is a reference pattern, zones with similar features can be classified quite reliably within the entire radargram.

In this situation, an attempt was made to solve the problem using a network trained without a teacher (unsupervised learning). Such networks analyse the internal self-similarity of the data and detect data that deviate from the majority. This method can be used to detect any anomalies in the data, so it can reveal zones in the embankments whose internal structure differs from the average, and these may be the sought-after places of internal destruction of the embankment structure.

The results of GPR signal analysis obtained using the network trained “with a teacher” are presented in Figs. 12–14, and those obtained “without a teacher” in Figs. 15 and 16.

Fig. 15.

Unsupervised classification—4 classes

Fig. 16.

Unsupervised classification—matching of classification

The results obtained using the two classification methods are not identical but are consistent with each other. The zones that we associate with the weakening of the embankment structure are located in similar places for both methods. The discrepancy between the classification results may be due to the way the peak sets were defined for the supervised method. Peaks were defined over a wide area of a high-Energy zone, which may not in fact be a weakened zone over its entire extent, so some samples that are not actually associated with the weakened zone may be classified into it. In the case considered, the classification method without a teacher indicated the location of the weakened zone more precisely.

Conclusions

In this research, the use of GPR to monitor and predict the strength of river embankments was considered. The results have shown the usefulness of the GPR geophysical method, integrated with ANN-based data analysis, for characterizing the embankment materials. In this way, weak zones within the embankment structure that are prone to encroachment by water are easily identified.

As shown in the paper, the GPR technique provides continuous information about the condition of embankments between the points where control drilling was carried out. The article addresses the problem of creating a method for analysing GPR signals aimed at detecting zones of weakness in flood embankments. Methods for the automatic recognition of damage that reduces the strength of embankments were created.

Zones that may indicate the earlier course of the Rudawa River have been revealed. Clear anomalies were recorded that could have originated from oxbows of the Rudawa River filled with different sediments. This finding may contribute to the research conducted by enthusiasts of Krakow who are reconstructing its former appearance.

The results of this research are consistent with the results of previous historical studies and complement them well.

Although this study covers only a small section of the embankments, the surveyed sections can be extended to cover much larger areas.

More generally, the results confirm the usefulness of GPR studies in other scientific fields, in particular those outside the exact sciences.

It should also be remembered that the GPR method, applied before the commencement of construction works, could prevent additional works and costs by determining the suitability of land for development. Appropriate soil investigation should be carried out to avoid building on hidden oxbows that may not be visible at the surface. Most importantly, the relatively low cost of GPR surveys would allow future construction works to be designed appropriately, preventing serious problems for users of buildings and roads.