1 Introduction

Wetlands are ecosystems that are permanently or seasonally flooded by water. Peatlands also belong to the generic term wetlands. Ecosystems whose organic soils are more than 30 cm thick and contain more than 30% organic material are often referred to as peatlands (Joosten and Clarke 2002). A uniform definition does not exist. The terms peatland and wetland are often used synonymously in Germany, as mainly they are included in the land use category “wetlands” (UBA 2021). The peatland ecosystem is a very important natural carbon sink. Carbon is sequestered from the air through plant growth and permanently stored through the formation of peat. The rate at which atmospheric carbon is absorbed and released from the peat is directly dependent on the hydrology of the ecosystem (Succow and Joosten 2012). Only a water-saturated peatland can store carbon over long periods (Succow and Joosten 2012).

But many peatlands are drained and return carbon to the atmosphere in the form of greenhouse gases. In Germany, 91.9% of organic soils are drained and emit about 48 million tonnes of CO\(_2\) equivalents per year, which corresponds to about 6% of total emissions in Germany (UBA 2021). The global emission from drained organic soils were nearly one billion tonnes CO\(_2\) equivalents annually (Tubiello et al. 2016). Other important ecosystem services, such as flood control and providing habitat for rare animal and plant species, are also lost through drainage. Drainages lead to a lowering of the water level in the area. This is necessary for land-use such as agriculture or pasture. Drainage measures can also lead to the drying out of permanently open water areas in a peatland. In the course of rewetting, which is currently taking place on more and more peatlands, the drainage measures are being reversed. For deeply drained grassland, an annual saving of 17 tonnes of CO\(_2\) equivalent per ha can be achieved through rewetting (Wilson et al. 2016). If rewetting is successful, the groundwater level rises. Usually the water table depth (WTD) is measured by wells, which have to be checked individually and regularly.

Our work looks at two sub-aspects of hydrology: permanently open water areas and the groundwater level. To monitor the rewetting of peatlands, the WTD and the lake level are often measured ground-based, as in our case study. It is assumed that when peatlands dry out, WTD and lake level decrease, or when peatlands get wetter, WTD and lake level increase. However, the collection of in situ data is costly and not always possible. In our work we therefore explore to what extent remote sensing data can replace the measurement of in situ data. To this end, we are investigating the relationship between in situ and remote sensing data. Measuring the lake levels by altimeter is generally possible, but not suitable for small water areas like the lake in our study area. So, instead of measuring the lake level, we decided to measure the lake area. We assume that the lake area increases as the lake level increases and that there is a direct correlation between these two variables.

In the course of flood monitoring, radar data are often used for water area detection with moderate to good accuracy: Cazals et al. (2016) indicates that open water can be detected successfully, flooded grassland with moderate accuracy. Martinis et al. (2015) reach a high user accuracy, but in mountainous region the accuracy decreases, because of a lot of water look alikes caused by radar shadowing effects. Pulvirenti et al. (2011) also see problems with thresholding due to roughening effects caused by wind speed. Tsyganskaya et al. (2018) use Sentinel-1 synthetic aperture radar (SAR) time series data to classify temporary open water and temporary flooded vegetation. One advantage of using radar data is the independence from cloud cover, which often occurs simultaneously with flood events (Boni et al. 2016). Water surface detection also finds application in land use land cover (LULC) detection. Bagwan and Sopan Gavali (2021) and Haque and Basak (2017) achieved high accuracy in detecting water bodies by using Landsat data. However, a detailed consideration of the lakeshore region and the boundaries to other areas is not presented. Huang et al. (2018) present in their review the growing need to map surface water at a global scale and discuss the problem of thresholding as the most critical issue in using water indices. Another issue Huang et al. (2018) and Liao et al. (2014) discuss is that vegetation cover tends to dominate the spectral signal, so flooding water under vegetation is difficult to detect. Liu et al. (2019) analyse the shadow effect in optical data in urban centres, which were mostly misclassified as water, when only optical images are used. They investigate a method for LULC mapping with optical and SAR data and reach high user accuracy in detecting water surfaces. In the special consideration of wetlands the focus is on monitoring landscape changes (Muro et al. 2016), distinguishing wetland areas from other land covers (Dong et al. 2014; Kaplan and Avdan 2018) and mapping of vegetation types (Mleczko and Mróz 2018) and water bodies (Huang and Jin 2019; Mleczko and Mróz 2018; Moser et al. 2016). White et al. (2015) review techniques to demonstrate SAR capabilities for wetland monitoring and recommend SAR as the primary source of imagery, supported by other data such as lidar, thermal, and optical imagery. Guo et al. (2017) concluded in their review of wetland remote sensing that multi-source integration should be the trend of future wetland remote sensing. Kaplan et al. (2019) show the strong relation between Landsat 8 NDWI and Sentinel-1 data in wetlands.

Previous studies are successfully showing the connection between SAR backscatter and groundwater levels. The radar signal penetrates the upper soil layer and the backscatter is influenced by soil moisture (Baghdadi and Zribi 2016). Soil moisture is related to groundwater via capillary forces so there is a connection between radar backscatter and WTD (Asmuß et al. 2019). Kim et al. (2017) analysed the hydraulic change in a peatland by comparing averaged SAR intensity from Radarsat-1 and ALOS PALSAR with groundwater level changes. They found that the SAR backscattering is significantly responsive. Bechtold et al. (2018) use ENVISAT ASAR C-band backscatter data and obtain moderate correlation coefficients. They conclude that backscatter is a good indicator for WTD dynamics with a strong capillary connection between WTD and the C-band-sensitive top 1–2 cm of peat soils, but the interpretation seems to be more difficult for natural than for drained peatlands. They see potential in the use of finer resolution SAR data. Asmuß et al. (2019) evaluated the correlation between high spatial resolution Sentinel-1 \(\sigma ^0\) and WTD in drained organic soils. All the mentioned publications discuss the influence of vegetation. Bryant and Baird (2003) and Harris and Bryant (2009) show in their analysis of sphagnum mosses the existence of strong relationships between near-surface hydrology and the spectral behaviour of vegetation. Meingast et al. (2014) find strong correlation between water table positions and various spectral indices.

In general, remote sensing offers great potential for environmental and climate protection because solutions can be applied worldwide, long time periods can be studied, and the methods are very cost-effective and less labour-intensive compared to in situ measurements. Equally important is the contactless measuring method of remote sensing, which is ideal for observing protected ecosystems.

We use the concept of optical and SAR data fusion as it is used for LULC detection with good results to overcome problems like radar and optical shadowing effects. When monitoring peatlands, radar data have the additional advantage of penetrating the vegetation cover, but the penetration depends on the plant height and the density of the leaf area. Thus, the lake size could be determined independently of the vegetation growing on the water surface. The modified normalized difference water index (MNDWI) is used for lake size measurement, because MNDWI can obtain great water body separation precision and, according to Bhunia (2021), has more accurate open water knowledge than other spectral indices like automated Water Extraction Index (AWEI), Water Ratio Index (WRI) and New Water Index (NWI). The radar data allows us to investigate the moisture in the upper soil layers. To analyse the relationship of the remote sensing data to the WTD, we decided to use the multi-spectral index NDWI in addition to C-band SAR data, because it shows strong correlations with peat moisture at 3 cm depth (Meingast et al. 2014), which is approximately the penetration depth of the radar signal.

The aim of this study is to develop methods that help to monitor the renaturation of peatlands and to demonstrate the potentials and limitations of these methods using freely available Sentinel-1 and Sentinel-2 data. In this work we explore the following research questions:

  • Is the MNDWI or the SAR backscatter better suited for measuring the lake size and which data have the highest correlation with the lake level?

  • Is the fusion of the data via manually weighted features or via the random forest algorithm more suitable for determining the lake area?

  • Can we find correlations between SAR backscatter or NDWI and the WTD and what are the limiting factors here?

  • Is there a relation between the lake size and the lake level with regard to mapping them from Sentinel-data?

2 Study Area

Fig. 1
figure 1

Study area peatland Federseeried with lake Federsee, located in Baden-Württemberg, Germany. Location of monitoring wells for WTD measurement (blue) and location of station for lake level measurement (red)

The peatland Federseeried in Baden-Württemberg, Germany, has an area of over 30 km² and has been under nature conservation since 1939 (LGRB 2021). The peatland was chosen as the study area because the WTD is monitored at about 85 observation wells (Fig. 1). The WTD’s measurements in Federseeried began in 2005 with individual monitoring sites, which increased steadily to monitor the rewetted areas. The peat body, which has organic silt underneath, varies greatly in his thickness. The northern Federseeried is of particular interest. These meadows were drained in the nineteenth century by an elaborate canal system for agricultural use. In summer 2014, construction projects were completed that should lead to the rewetting of the peatland. The potential of the northern Federseeried for our project lies in the high density of observation wells (approx. 43) and the fact that there is low vegetation. Which means that there is a prospect of good measurement results from the SAR backscatter. The management of the peatland is described as extensive. The vegetation in the northern Federseeried is mown twice a year and the mown grass is used for agricultural purposes, e.g. pressed as pellets or as bedding for livestock. The aim is to preserve an open landscape and wet meadow. The peatland lake Federsee is on average only 1 m deep and surrounded by a reed belt up to 100 m wide. The reed belt also covers large parts of the water surface.

3 Methodology

3.1 Satellite Data

Data from ESA’s satellites Sentinel-1 and Sentinel-2 are used. We use Sentinel-1 Ground Range Detected (GRD) C-band data acquired using interferometric wide swath mode (IW). For the study area described above, a Sentinel-1 product is downloaded for each month of 2019 and both VH and VV polarisation are used. The backscatter data are preprocessed using SNAP, which involves thermal noise removal, radiometric calibration, decibel (dB) conversion and terrain correction. These processing steps result in the \(\sigma ^0\) (dB) backscatter data we use in the further course. Furthermore, we use one Sentinel-2 Level 2A product per month of 2019. No cloud-free image are available for the months of January and November, which is why these months had to be omitted from the evaluation. In each case, the product with the lowest cloud cover and temporally closest to the date of recording of the Sentinel-1 data set is selected in order to keep the comparability of the data sets as high as possible. We use band 3 (GREEN = 560 nm), band 8 (NIR = 842 nm) and band 11 (MIR = 1610 nm).

From these bands we calculate the MNDWI according to Xu (2006) as follows:

$$\begin{aligned} \mathrm {MNDWI} = \frac{\mathrm {GREEN} - \mathrm {MIR}}{\mathrm {GREEN} + \mathrm {MIR}} \end{aligned}$$
(1)

Furthermore we calculate the NDWI developed by McFeeters (1996).

$$\begin{aligned} \mathrm {NDWI} = \frac{\mathrm {GREEN} - \mathrm {NIR}}{\mathrm {GREEN} + \mathrm {NIR}} \end{aligned}$$
(2)

Both indices range between \(-1\) and \(+1\). Water surfaces usually have a high MNDWI. A high NDWI indicates a high water content of the surface.

3.2 In Situ Data: Lake Level and Water Table Depth

Fig. 2
figure 2

Lake level Federsee 2019 [cm], measured daily. Min: 53 cm, max: 85 cm, mean: 64 cm above the lake bottom

Fig. 3
figure 3

Average water table depth (WTD) in Federseeried 2019 [cm] of all wells, measured weekly. Min: \(-54\) cm, max: \(-12\) cm, mean: \(-26\) cm below the ground surface

The water level of the lake Federsee is measured daily (Fig. 2). In 2019, the lowest lake level was at 53 cm, the highest at 85 cm and the average at 64 cm above the lake bottom. The WTD of the peatland Federseeried is monitored at about 85 observation wells. Figure 3 shows the average WTD at all monitoring wells over the year 2019. The average level was \(-24\) cm with a maximum of \(-12\) cm and a minimum of \(-54\) cm below the ground surface. A groundwater level of more than \(-10\) cm below the soil surface is the target in natural peatlands. The measurements are taken weekly by employee of the NABU Nature Conservation Centre Federsee (Naturschutzbund Deutschland-Naturschutzzentrum Federsee). The collection of data is carried out manually. In the WTD data set, only the corresponding calendar weeks are noted, not the exact day of the measurement.

The year 2019 was one of the four hottest years in the state of Baden-Württemberg since the beginning of consistent weather records in 1881. The water reserves in the soil, which were greatly reduced by the drought year 2018, were not able to fully regenerate in 2019 (LUBW 2020).

3.3 Lake Size Measurement

Fig. 4
figure 4

Overview of the methods and data for lake size measurement

A quick and easy way to detect the open water area via the combination of the different features (\(\sigma ^0\)-VH, \(\sigma ^0\)-VV, MNDWI) is by using the machine learning algorithm random forest. However, the random forest algorithm requires training data, which are not always available and the generation is labour-intensive. Also, analysing the whole model and the way the algorithm makes its classification is very challenging. Therefore, we additionally consider the features individually and combine them with two methods whose decision basis we know and determine ourselves. To do this, we first consider all three features and use them individually for classification. The three classification results of the individual features are combined and a statement about the probability of the pixels to belong to the water area is made using the mathematical Dempster–Shafer approach. Furthermore, the features are extracted, weighted and the lake area is delineated by analysing the histogram of the fusion image. An overview of the proceeding is given in Fig. 4.

3.3.1 Supervised Classification with Random Forest

Breiman (2001) describes random forest as ensemble of tree predictors that vote for the most popular class. Each tree depends on a random selection of features, which proves to be more robust with respect to noise (Breiman 2001). Variable importance can be indicated by internal estimates (Breiman 2001). Compared to the fusion image, random forest offers the advantage that weights for each data set are learned individually. A disadvantage of the method is that training data is required, which was not available in this project and had to be created manually. As features we use the three data sets \(\sigma ^0\)-VH, \(\sigma ^0\)-VV and MNDWI.

3.3.2 Unsupervised Classification with Feature Analysis

Feature analysis The three data sets \(\sigma ^0\)-VH, \(\sigma ^0\)-VV and MNDWI served also as inputs for an adaptive thresholding method (Fig. 5). This method is applied to each data set individually. For \(\sigma ^0\), the minimum between the two peaks of the histogram is selected as the threshold for the separation between the two surfaces, water and land (Fig. 5c).

Because the determination of the threshold value is ambiguous for the MNDWI data examined using the adaptive threshold value procedure explained above, a fixed threshold value of \(-0.25\) is chosen. The threshold value results from the visual interpretation of the MNDWI images.

The images were divided into two segments, land and lake, using the threshold (Fig. 5c, d). To determine the lake size, only the pixels that fulfils the following conditions were considered as lake area: ‘pixel-value < threshold and located in the lake section’ for \(\sigma ^0\) and ‘pixel-value > threshold and located in the lake section’ for MNDWI.

Fig. 5
figure 5

Method adaptive thresholding a Sentinel-1 data: \(\sigma ^0\)-VH (dB), b red polygon: lake section, c histogram of \(\sigma ^0\)-VH of the lake section with a threshold at \(-22\) dB separating lake-pixel (left) from land-pixel (right), d created lake mask of \(\sigma ^0\)-VH on 2019-07-13, blue: lake (pixel-value < threshold and located in the lake section), red: misclassified area (pixel-value < threshold but not located in the lake section)

The size of the lake mask is therefore directly dependent on the threshold. Since there is no reference data for the lake size, we evaluate the measurements by forming correlations between lake size and lake level. We assume that the area of the lake also increases at a higher lake level. The correlation between lake level and lake size is determined using Pearson’s correlation coefficient.


Fusion using weighted features Furthermore, we generate a fusion image by weighting and combining the three features. In the first step, the data must be scaled so that features with a large range of values do not dominate the fusion result. The images are scaled to a value range between zero and one. Due to the feature properties, MNDWI high values are characteristic for lake area, while for \(\sigma ^0\) water areas have a small value. Hence, to combine the data sets in a meaningful way, part of the data sets must be inverted. We calculate the fusion image via the following expression:

$$\begin{aligned} \begin{aligned} \mathrm {Fusion\,image}= & {} \,G_{\mathrm {VH}} \cdot \sigma ^{0}_{\mathrm {VH}} + G_{\mathrm {VV}} \cdot \sigma ^{0}_{\mathrm {VV}} \\&+ \, G_{\mathrm {MNDWI}} \cdot \mathrm {MNDWI} \end{aligned} \end{aligned}$$
(3)

The weighting G is set as \(1+r\) and results from the correlation coefficients (r) between lake area and lake level (Table 2). This ensures that a data set which is considered more accurate has a greater impact on the overall result than a data set which is considered less accurate. The adaptive thresholding method is then applied on the fusion image. The threshold is set at the minimum between the two peaks of the histogram, followed by the calculation of the lake size as described above.


Dempster–Shafer theory adaptation Now, the Dempster–Shafer theory is applied to analyse the credibility of the created lake masks. The Dempster–Shafer theory of belief functions is based on Dempster’s work on upper and lower probabilities in the 1960s and Shafer’s monograph in the 1970s (Yager 2008). A special feature of the Dempster–Shafer theory is the evidence that allows uncertainty to be expressed explicitly. An additional advantage of the theory is that it has a combination rule to combine evidence from multiple independent items. The theory is usually used when the credibility of the individual items is known and the significance of a common hypothesis is to be determined. In this project, the three lake masks (based on the adaptive thresholding method of \(\sigma ^0\)-VH, \(\sigma ^0\)-VV and MNDWI) serve as independent items. However, the credibility of the masks, represented by a probability number, is not known and cannot be determined due to the lack of reference data. Therefore, we create two different scenarios of credibility for each mask and analyse the resulting mass function of the common hypothesis for realism (Table 2). If we consider the two states (lake pixel or non lake pixel) and the three lake masks (\({\mathrm{Mask}_{\sigma ^0{\text {-VH}}}}\), \({\mathrm{Mask}_{\sigma ^0{\text {-VV}}}}\), \(\mathrm {Mask_{MNDWI}}\)), each of which giving an independent statement, eight different classes of combinations result (Table 3). Assuming the considered pixel is a lake pixel, the mass functions m(A) listed in Table 3 are calculated using the basic probability numbers from Table 2 for both scenarios and the formulas to mass functions according to Yager (2008).

The scenarios are presented in Sect. 4.1. Based on the results, a statement is to be made which scenario achieved a more trustworthy image. Scenario 1 is based on the results of the fusion image, scenario 2 is based on the results of random forest. A scenario with high probabilities for lake pixels in the lake section and low probabilities for lake pixels in the surrounding area is considered probable.

3.4 Water Table Depth Estimation

Next, we present the methodology used to explore the relationship between the peatland WTD and VH and VV polarized \(\sigma ^0\) as well as the relationship between WTD and NDWI. Unlike previous approaches the use of a speckle filter was dispensed with in the preprocessing of the SAR data. The reason for this was to counteract a falsification of \(\sigma ^0\) of the cell covering the water table depth measurement point. In contrast to the measurement of lake area, which rely on a certain homogeneity of adjacent pixels, this method is interested in the exact values at the given well coordinates.

For every observation date we extract data from the WTD data set measured in the same week. Also we extract pixel-values from the satellite images. For each observation well, we use only the single pixel-values that cover the location of the well. The correlation between WTD and \(\sigma ^0\) or NDWI is determined using Pearson’s correlation coefficient.

The analysis of WTD in our project is limited to individual points in time over the course of a year. The spatial correlation between the in situ WTD values and the remote sensing data is analysed. In order to analyse the temporal course, which would be interesting for the monitoring of renaturation, it would make sense to generate time series over several years and with all available SAR data sets. Since the quality of the in situ data is limited (no exact date of measurement given, only the calendar week) and in order to keep the processing effort low, we only analyse selected individual points in time for their spatial correlation in this work.

4 Results

4.1 Results on Lake Size

Table 1 Results of lake size [km\(^2\)] using the different lake masks with mean (Ø) and standard deviation (SD)
Table 2 Correlation coefficient r for the relation between lake size (from adaptive thresholding) and lake level, basic probability numbers for scenario 1 (S1) and scenario 2 (S2) for analysing lake size according to Dempster–Shafer theory, results of the random forest (RF) algorithm in SNAP: main feature (number of evaluations in which the lake mask was classified as the most important feature by the random forest algorithm) and testing feature importance score (average feature accuracy output from the random forest method in SNAP)
Table 3 Classes, conditions and mass functions m(A) after Dempster–Shafer theory for scenario 1 (S1) and scenario 2 (S2), m(A) is calculated using the basic probability numbers from Table 2 and the formulas to mass functions according to Yager (2008)

The results of the lake size measurements with the different methods are presented in Table 1. The area of the masks is calculated by multiplying the number of pixels inside the mask by their size.

The largest lake masks are those from the radar data. The smallest are those of class 1. This is logical, since the condition of class 1 is that a pixel must be classified as a lake by all three lake masks that use a single data set. The standard deviations of the areas range between 0.016 km\(^2\) (\(\mathrm {Mask_{FI}}\)) and 0.037 km\(^2\) (\(\mathrm {Mask_{MNDWI}}\)). For the analysis of the lake area measurements, we evaluate which method captures the most pixels that lie within the lake section and the least pixels that lie outside this area and are therefore defined as misclassified. Furthermore, it is considered which method has the highest correlation with the lake level. The correlation is determined only for the lake masks using one data set. Since the satellite images from Sentinel-1 and Sentinel-2 are often not taken on the same day, the fusion methods cannot be assigned an exact lake level for a given day. This would reduce the significance of the correlation.

Some of the histograms prove to be suitable for establishing a separation between lake and land. The histograms show two peaks, which can be easily distinguished from each other and can be assigned to the land or water area (Fig. 6). The threshold varies nearly every month. The measured values for the water areas are in the range of \(-25\) to \(-15\) dB for the VV polarisation and \(-22\) to \(-32\) dB for the VH polarisation. The average thresholds are \(-22.4 \pm\) 1.1 (VH) and \(-16.1 \pm\) 1.9 (VV).

Fig. 6
figure 6

Example histograms of the lake section for a \(\sigma ^0\)-VH 2019-10-11, the histogram shows two peaks, which can be easily distinguished from each other and are assigned to the land or water area. b \(\sigma ^0\)-VV 2019-09-16, peaks, minimum between the peaks and threshold are more difficult to identify

The measurement of the lake area via the adaptive threshold method using VH polarised \(\sigma ^0\) shows a higher correlation with the lake level (r = 0.72) than VV polarised \(\sigma ^0\) (r = 0.46). The determined lake size over the fixed threshold of \(-0.25\) also shows a relatively strong correlation (r = 0.61) (Fig. 7).

Fig. 7
figure 7

Relation between lake level [cm] and measured lake size [km\(^2\)] from a \({\mathrm{Mask}_{\sigma ^0{\text {-VH}}}}\), b \({\mathrm{Mask}_{\sigma ^0{\text {-VV}}}}\) and c \(\mathrm {Mask_{MNDWI}}\) by adaptive thresholding method, resulting Pearson’s correlation coefficients: \(r_{\mathrm {VH}}\) = 0.72, \(r_{\mathrm {VV}}\) = 0.46, \(r_{\mathrm {MNDWI}}\) = 0.61, the outliers are labelled with the data, all data are related to the study period 2019

In terms of the misclassified areas in the whole image section, the method of the weighted fusion image and the random forest algorithm showed the best results (both Ø = 0.2%). The misclassified areas for the adaptive thresholding method for \(\sigma ^0\)-VH, \(\sigma ^0\)-VV and MNDWI are significantly larger (Ø = 7.0%, Ø = 6.6% and Ø = 2.2%) (see Fig. 5d).

In addition to the label image, the random forest outputs a confidence image. This shows an increased uncertainty in the lakeshore area (Fig. 8).

Fig. 8
figure 8

Random forest results of 2019-12-16 (Sentinel-1) and 2019-12-03 (Sentinel-2). a Label image, purple: lake. b Confidence image shows an increased uncertainty in the shore area of the lake

Next we consider the analysis of the areas following the Dempster–Shafer theory of belief functions. First, we distinguish two scenarios: In scenario 1 (S1), as with the weighted fusion image, the basic probability numbers are set based on the correlation of the calculated lake areas with the lake level. The lake mask that achieves the highest correlation with lake level is assigned a basic probability number of 0.9, the lake mask with the next highest correlation is downgraded by 0.1 to 0.8, just as the third lake mask is downgraded to 0.7. The distance between the basic probability numbers also approximately mirrors the distance between the determined correlation coefficients. In scenario 2 (S2) the basic probability numbers are set according to the results of the random forest algorithm in SNAP. The feature (here lake mask) which the algorithm selects to be more relevant for the classification is evaluated. In this case, the algorithm classifies one lake mask as the most important feature in five out of nine months with an importance score of 0.12. This lake mask is assigned a basic probability number of 0.9 and the mask with the lowest result is set to 0.6. A linear equation can be calculated from these two pairs of values that can be used to determine the value for the third mask. This results in a value of 0.7 (Table 2).

For the evaluation, the areas with the calculated mass functions are considered and it is analyzed which scenario is estimated to be more likely. Pixels that are recognized as lake in all three masks have the highest body of evidence of being a lake pixel (99% in both scenarios, Ø 1.319 km\(^2\)). Here, as in the previous fusion image, it is evident that the combination of the various data sets can highly reduce the misclassified areas (Fig. 9).

Fig. 9
figure 9

Lake masks for the entire study area in black (without the lake section-condition), Sentinel-1 data from July 13th and Sentinel-2 data from July 4th. a \({\mathrm{Mask}_{\sigma ^0{\text {-VH}}}}\), b \({\mathrm{Mask}_{\sigma ^0{\text {-VV}}}}\), c \(\mathrm {Mask_{MNDWI}}\), d \(\mathrm {Mask_{FI}}\) from the fusion image (FI), e \(\mathrm {Mask_{RF}}\) from the random forest (RF) classification, f \(\mathrm {Mask_{Class1}}\). Class 1 after Dempster–Shafer-Theory adaption, Condition: \(\mathrm {Mask_{MNDWI}}\) = lake, \({{\rm Mask}_{\sigma ^0{\text{-VH}}}}\) = lake, \({\mathrm{Mask}_{\sigma ^0{\text {-VV}}}}\) = lake. \(m(A)_\mathrm {S1}\): 99%, \(m(A)_\mathrm {S2}\): 99%. The fusion of all three data sets results in almost no areas outside the lake being misclassified as lake

Especially relevant for the evaluation are class 2, because the difference between the determined mass function is the highest (S1: 87%, S2: 41%), and classes 3 and 8, because they each have the second highest values. Figure 10 shows the areas that fulfil the condition according to class 2 (Ø 0.080 km\(^2\)). It is noticeable that especially on the days in September, October and December the areas can be clearly assigned to the lakeshore region. Referring to the condition of class 2 (MNDWI = no lake, \(\sigma ^0\)-VH = lake, \(\sigma ^0\)-VV = lake) the \(\sigma ^0\) lake masks include the lakeshore region, while the MNDWI mask excludes this region.

Fig. 10
figure 10

Class 2 after Dempster–Shafer–theory adaption (Condition: \(\mathrm{Mask}_\mathrm{MNDWI}\) = no-lake, \({\mathrm{Mask}}_{\sigma ^0{\text {-VH}}}\) = lake, \({\mathrm{Mask}}_{\sigma ^0{\text {-VV}}}\) = lake; \(m(A)_\mathrm{S1}\): 87%, \(m(A)_\mathrm{S2}\): 41%) Pixels that meet the conditions of class 2 are often located in the lakeshore region. The lake masks resulting from the SAR data measure a larger lake area than the lake masks resulting from the optical data

Class 3 (MNDWI = lake, \(\sigma ^0\)-VH = lake, \(\sigma ^0\)-VV = no lake) has the second highest mass function after scenario 1 with 94%. The pixels that meet this condition are in the lake area, but they are small areas compared to class 1 (Ø 0.028 km\(^2\)).

Class 8 (MNDWI = lake, \(\sigma ^0\)-VH = no lake, \(\sigma ^0\)-VV = lake) has the second highest mass function in scenario 2 at 88%. This class has less area than class 3 (Ø 0.007 km\(^2\)) and mostly pixels in the lakeshore region.

For a more accurate assessment, the following considers how much class 1 and 3 (S1), and class 1 and 8 (S2) do not lie in the lake section and thus counts as misclassified area. Scenario 1, which is based on the weighted fusion image, has less misclassification on average (Ø = 0.11%). Scenario 2, which is based on the classification according to the random forest algorithm, misclassifies on average a slightly higher area surrounding the lake as lake (Ø = 0.13%).

4.2 Results on Water Table Depth Estimation

The correlation coefficients between \(\sigma ^0\) data and WTD are shown in Fig. 11. The observation of the correlation coefficients mainly shows no or weak correlations between the WTD and \(\sigma ^0\). Moreover, no stable correlation can be observed over all data sets. If only the observation wells of the northern Federseeried are used for the evaluation, the correlations stabilise (Fig. 11b).

Fig. 11
figure 11

Correlation coefficients (r) for water table depth (WTD) and \(\sigma ^0\)-VH or \(\sigma ^0\)-VV. a Whole peatland, b only northern Federseeried

It can be seen that the correlation coefficients from January to May for both polarisation are in a similar range of values (about 0.2–0.4), with the exception of the values of the VV polarisation on March 15th, which, however, generally show conspicuousness. The VH polarisation shows stronger correlations with the WTD than the VV polarisation. On June 18th and July 13th, a negative correlation is calculated for both polarisation (\(-0.10\) and \(-0.29\)). From August to December both polarisation showed a large variability in correlation coefficients (\(-0.22\) to 0.46).

Fig. 12
figure 12

Correlation coefficients (r) for water table depth (WTD) and NDWI. a Whole peatland, b only northern Federseeried

Next, the correlations between WTD and NDWI are presented (Fig. 12). A consideration of only the northern Federseeried with lower vegetation is also carried out here (Fig. 12b). The correlation coefficients are higher in September 19th, October 14th and December 3rd (0.41, 0.44, 0.41) and lower on February 21th and March 21th (0.25). Again, the opposite behaviour of June and August (\(-0.44\), \(-0.33\)) is conspicuous.

5 Discussion

Looking at the correlation between the lake masks of the individual data sets and the lake level, we see that the \(\sigma ^0\)-VH data has the highest correlation with 0.7. The fact that the measurement of water area via the adaptive thresholding of VH polarisation has the highest correlation with lake level and scenario 1 which weights \(\sigma ^0\)-VH stronger has a slightly lower misclassification leads us to the statement that this polarisation is more suitable for monitoring the peatland in this project. This differs from earlier assumptions from other projects such as Huang and Jin (2019). The better suitability of VH polarisation could be due to the fact that it is not as susceptible to the double bounce effect as VV polarisation caused by the reeds on the lake.

The correlation could have increased if more time points had been included in the evaluation. Schwatke et al. (2019) use for the measurement of lake size five different water indices for the time period between 1984 and 2018. They improve correlations between lake level and the lake size they measured on average from 0.610 of up to 0.834 with their method of filling data gaps by using a long-term water probability mask. Our work shows that lake size can also achieve high correlations via radar measurements. Schwatke et al. (2019) describes, as we do, that it was not possible to read a suitable threshold for MNDWI from the histograms. They solve this problem by developing a new approach for automated threshold computation by combining the information of all five indexes. The challenge of finding a suitable threshold for masking the water area is also described by Pulvirenti et al. (2011). They suggest that the problem with setting the threshold for SAR data is based on the roughening effect, which is caused by wind speed. In our case, the water surface is mainly rough because of the dense plant cover of reeds. For January and March it was not possible to find a suitable threshold in our SAR data. In January this was probably due to the ice cover on the lake. In March we can only explain the noisy SAR image by a sensor problem. In general, to improve the thresholding an automated and mathematically more accurate method for adaptive thresholding can be applied.

Now, we will consider the fusion of the data sets. First of all, it has to be said that problems can occur when combining data from different satellites, because e.g. the overflight times do not match and a time gap occurs. Second, some Sentinel-2 data cannot be used due to cloud cover. In addition, there are uncertainties resulting from the different spatial resolution of the data (Sentinel-1 5 m \(\times\) 20 m; Sentinel-2 VIS: 10 m, SWIR: 20 m; WTD point data). Nevertheless, the results show that combining the data is superior to using one single data set. The Dempster–Shafer approach has also shown that combining both satellite data gives a very reliable estimate of the lake area. The weighted fusion image and the random forest algorithm, both using Sentinel-1 and Sentinel-2 data, show in terms of misclassification that they are better suited for water area detection than the individual data sets. In situ data are used to create the fusion image, but not for the random forest classification. The fact that the classification using the random forest algorithm yields comparable results to the thresholding of the fusion image suggests that a good estimate of the lake area is possible without in situ data. However, it must be noted that even the fusion image does not necessarily provide the best estimate of the lake area, because it is questioned whether a rigid weighting over the entire year is useful, since it is shown that the significance of the different data sets varies strongly over the year (Table 1). Therefore, the random forest with its adjusted weighting may even be more appropriate.

Considering both the random forest algorithm and the Dempster–Shafer theory, it is noticeable that the lakeshore region has a large uncertainty in the classification. Schwatke et al. (2019) also see uncertainties in the lakeshore area in the analysis of the five water indices. They consider the reason therefore is mainly the different colour in shallow water. However, this phenomenon also occurred in our study when comparing radar data with a water index. Class 2 clearly shows that the MNDWI mask is smaller than the radar masks. Martinis et al. (2022) create water masks based on analysis of Sentinel-2 data and use Sentinel-1 data to fill data gaps due to cloud cover. They note that in the shore area, when the water is clear, optical sensors can receive signals from the bottom of the water body and this can lead to an underestimation of the water area. Peña-Luque et al. (2021) also present a methodology to generate large-scale water maps by merging Sentinel-1 and Sentinel-2 data. In contrast to Martinis et al. (2022) they observe, just like Kaplan et al. (2019), that radar would underestimate the areas more than optical methods. We cannot confirm the poor penetration of the radar signal, which Kaplan et al. (2019) mention as a reason for the different lake sizes from our research. The fact that our MNDWI lake masks are smaller than the radar masks is probably due to the lower threshold value of \(-0.25\) instead of 0, which is normally chosen as threshold for MNDWI. In addition to the choice of threshold, we also see that a clear ecological boundary between land and water is difficult to define in the peatland ecosystem. Higher resolution and the inclusion of ecological data could reduce these uncertainties.

Since the amount of data we analysed regarding WTD is small compared to other publications, we discuss our results primarily with the findings from other research rather than deriving conclusions from our data. Asmuß et al. (2019) obtained a temporal spearman correlation coefficient of 0.45 (± 0.17) between \(\sigma ^0\) and WTD on three study sites in a backscatter time series of two years. Bechtold et al. (2018) studied 17 peatlands in Germany over three years and achieved correlations of 0.38 for natural peatlands and 0.54 for drained peatlands used for agriculture. Their spatial resolution is about 100 m (Asmuß et al. 2019) and about 500 m (Bechtold et al. 2018) is coarser than ours with about 20 m.

Similar to us, Asmuß et al. (2019) observe a loss of correlation during summer. In both of our Sentinel data, anomalies can be observed in the summer months of June 18th/19th and August 12th/18th (in Sentinel-1 also in July 13th). So as mentioned by Asmuß et al. (2019) and also by Bechtold et al. (2018) one possible explanation is mowing. According to the NABU nature conservation centre Federsee the vegetation growing in the northern Federseeried is mown twice a year, in June and August, in order to maintain the meadows as open landscape and wet meadows. If the mown grass is left on the meadow to dry, entirely different surface characteristics result, which can strongly influence both the optical and SAR data. We can therefore confirm the effect of cuts in our data.

Asmuß et al. (2019) consider soil and vegetation information in their method. The difference between the correlation coefficients of the whole peatland and those of the northern Federseeried shows that vegetation has a strong influence on the correlation between the backscatter of the SAR measurement and the WTD. The vegetation height is much lower in the northern Federseeried and can therefore be penetrated more easily by the radar signals.

Large parts of the Federseeried can be described as natural peatlands. Although the northern Federseeried started to be rewetted four years ago, our results with the higher correlations according to Bechtold et al. (2018) resemble an agriculturally used drained peatland.

In further work we could consider more additional information like Asmuß et al. (2019) and we could extend the time period and the study area to generate more stable and robust information. Another problem we see is the comparability of the different spatial resolutions of our data. Given that the water level measurements are point data and the remote sensing data are raster data, uncertainty arises. Also, as in the lake size methods a fusion of the remote sensing data can be considered.

6 Conclusion and Outlook

In the field of environmental protection, there is a need for evaluating methods, which the actors can use over long periods of time to identify trends in the protected areas. The use of freely available Sentinel data and working with the open-source software SNAP and QGIS as well as the easy-to-use random forest algorithm make this possible.

Our project shows despite the good suitability of VH polarised \(\sigma ^0\) that the meaningful combination of different remote sensing data is superior to the consideration of single features. The two methods that combined \(\sigma ^0\)-VH, \(\sigma ^0\)-VV and MNDWI, the weighted fusion image and the random forest algorithm, produced comparable results. We consider the replacement of lake level measurement by lake size measurement through remote sensing to be a suitable alternative. Potentials lie in the inclusion of further environmental conditions such as temperature and precipitation, in the combination with biological and ecological knowledge, and in the application of other machine learning or deep learning methods. This way, the complex interrelationships within the ecosystem could be learned and not only could the current state be represented, but also could the future performance of the peatland be estimated. We see measuring water levels in organic soils as a big task and have identified the comparability of different spatial resolutions, the effects of peatland management (e.g. grass cuts) and the acquisition of suitable in situ data for further research as future challenges.

The advantage of remote sensing is that data are collected worldwide and therefore methods that use this data can easily be applied all over the world. However, environmental conditions are diverse and vary from region to region. Whether methods that have been tested successfully in one region can be transferred to other regions is not certain. The challenge will be to develop unified procedures and provide accurate and comparable results despite the complexity of ecosystems. So remote sensing can be a tool to create databases, which can be used for example for recording the climate protection services of ecosystems. The comparability of the results is particularly relevant because climate protection services are to be traded on a CO\(_2\) market in the future (Michel 2021).