1 Introduction

Flooding is a severe natural disaster that can arise from many causes. In coastal areas, tropical cyclone-induced flooding is the combined effect of storm-surge-driven seawater inundation and rainfall-induced freshwater flooding. If tropical cyclone-induced flooding coincides with the rainy season, the consequences are even more serious, and if it strikes densely populated areas and large cities, it results in enormous loss of life and property [36]. For example, on August 26–28, 2017, Hurricane Harvey lingered over the densely populated Houston area, dumping massive amounts of rain. The extraordinary flooding caused over 80 fatalities [33], and according to the National Hurricane Center, Harvey produced over 125 billion dollars in damage.

Coastal flooding may become considerably more severe in the future as a result of climate change and anthropogenic activities. First, higher temperatures allow the atmosphere to hold more moisture, enhancing the intensity of flooding [34]. According to Patricola and Wehner's simulations [23], climate warming has already increased the average and extreme rainfall of storms Katrina, Irma, and Maria. Human activities, according to Bilskie et al. [2], can exacerbate the impact of coastal inundation on infrastructure, and the study [38] found that urbanization worsened both the flood response and the total rainfall from hurricanes. These studies should raise our awareness of increased flooding in highly urbanized and densely populated coastal areas, in both developed and developing countries.

Accurate flood mapping can help emergency managers create more focused disaster response strategies, and it can help researchers better understand flooding dynamics and develop more accurate forecasting methods. Floods can be mapped through ground surveys or through information retrieval from remote sensing imagery. Ground surveys are direct and exact, but they are expensive, and certain regions are inaccessible to humans after flooding. Flood mapping from remote sensing data is low-cost and can cover areas humans cannot access. The first remote sensing data source is optical imagery, which is easy for humans to interpret and use. However, optical sensors do not work at night and cannot see through cloud, which limits their applicability for information extraction during flooding. The second data source is radar remote sensing, especially synthetic aperture radar (SAR), which provides high-resolution images. SAR is a useful remote sensing tool for flood mapping since it can image floods at any time of day or night and in almost any weather condition. This ability is especially useful for mapping dynamic flooding to understand flooding mechanisms and support disaster relief planning.

Traditional flood mapping techniques using SAR data rely on image processing techniques that use backscattering, statistical, and polarimetric information. These methods include histogram thresholding [3], active contour segmentation [13], region growing [21], change detection [9, 20], statistical classification [10], neuro-fuzzy classification [6], multi-temporal statistics [4], pixel-based supervised classification [35], and object-oriented rule-based classification [25]. Although traditional methods have achieved good results in some cases, and some are even used in practical applications, they mine multi-dimensional SAR data using human-crafted features and rules. It is difficult for such features and rules to guarantee stable performance under a variety of influences, including: (1) speckle; (2) temporal mis-registration; (3) imaging system parameters [22]; (4) meteorological factors [12, 18]; and (5) environmental conditions.

Deep learning (DL) technology, particularly deep convolutional neural network (DCNN) models, offers a promising route for reliable flood mapping. Instead of being pre-defined, the features for reliable flood classification in DCNN models are mined directly from the multi-dimensional SAR data. These data-driven models are capable of offering reliable characteristics under a variety of influencing conditions, and they are optimized from data to information in an end-to-end style. This concept has been proven in a variety of communities, including computer vision [29], biomedical image processing [7], and geoscience [15, 26, 39]. DCNN-based methods for flood mapping have been proposed recently. Kang et al. [14] demonstrated that a fully convolutional network, one type of DCNN model, can produce more precise flood mapping than previous approaches. Rudner et al. [28] presented a promising DCNN-based method for retrieving flooded built-up areas. We [18] presented a modified DCNN method for coastal flood mapping from multi-temporal dual-polarimetric SAR data that offers reliable results and is suitable for spatial and temporal investigation of storm-caused coastal flooding. For flood mapping in built-up areas from high-resolution SAR imagery, Li et al. [16] presented an active self-learning DCNN model.

We believe that DCNN models can overcome the difficulties of robust flood mapping, based on our past research [18, 19]. This chapter describes the DCNN-based SAR coastal flooding mapping network (SARCFMNet), a model designed specifically for coastal flood mapping. It has two improvements that increase accuracy and robustness: (1) the physics-aware input information design fuses temporal and polarimetric information for more reliable mapping and integrates the radar remote sensing mechanisms of flood extraction into the DCNN; (2) the regularization scheme suitable for fully convolutional networks enhances the model's reliability. The SARCFMNet was trained and tested using a dataset of coastal flooding in Houston, Texas, induced by Hurricane Harvey in 2017. The flooded regions, which cover around \({4000}\,\textrm{km}^{2}\), are delineated and studied in these images. The contributions of this study are listed as follows:

  • Compared to the commonly used benchmarking DCNN approach, SARCFMNet performs better and is more stable. This demonstrates that the physics-aware input information design and the regularization scheme can improve performance and reliability.

  • The spatial and multi-temporal characteristics of the Harvey-caused inundation are investigated using the mapping results.

  • The wind influence is revealed, implying that DCNN models considering wind impact could improve reliability in practice.

  • The cost-sensitive losses for DCNN models are investigated, which might be beneficial for more adaptive models that take performance costs into account.

  • The trained and tested SARCFMNet model is applied to Bangladesh, one of the United Nations (UN)-defined least developed countries, to produce nation-level, multi-year, high-temporal-resolution flooding maps. This can help us gain a deeper understanding of the flooding mechanisms of this country.

The chapter is organized as follows. In Sect. 2, we will introduce the dataset used for the model training and testing. The model is described in Sect. 3. In Sect. 4, the model performances are presented. In Sect. 5, the multi-year flooding maps of Bangladesh are given with discussions. Section 6 concludes this chapter.

2 Dataset

2.1 Data Description

The dataset used for training and testing the SARCFMNet model is collected from Sentinel-1 SAR data acquired during Hurricane Harvey. Around the end of August 2017, Hurricane Harvey caused damage to the Houston region. Six pairs of Sentinel-1 SAR images were obtained over the study area during this time. The images have VH and VV polarizations. One pair is from the Stripmap (SM) mode, while five pairs are from the Interferometric Wide (IW) swath mode. Ground Range Detected (GRD) products are utilized. Table 1 lists the data parameters of the dataset. The post-event image of the IW01 pair is impacted by strong wind: Harvey had weakened to a Tropical Storm by the time this image was taken, but it still delivered powerful winds to the scene, with speeds of around 20 ms\(^{-1}\) [31]. We labeled the flooded regions as the ground truth using land-cover categories from Google Earth, OpenStreetMap, and Copernicus Emergency Management Service Rapid Mapping products [5].

Table 1 Descriptions of the image pairs for generating the dataset in this study
Fig. 1

Illustration of one image pair constructing the dataset

In Fig. 1, we give a visual illustration of one of the pairs constructing the dataset, the SM01 pair. The first and second rows show the VV and VH images, respectively; within each row, the first and second columns show the pre- and post-event images. The OpenStreetMap and Google Earth images of the region are in the third row. The SM01 pair covers Houston's western and southern areas.

2.2 Data Preparation

The original SAR images are processed in the following steps, as illustrated in Fig. 2, to construct the dataset for model training and testing.

  1. Application of orbit file: The precise satellite orbit files are applied to the SAR products.

  2. Filtering with sliding windows: A speckle filter is applied to lessen the impact of speckle on the SAR images.

  3. Radiometric calibration: After this calibration, the pixel values of the SAR images represent the backscattering information (\(\sigma ^0\)).

  4. Conversion to dB: The linear-scale \(\sigma ^0\) is converted to decibels (\(\sigma ^0_\text {dB}\)), generating the normalized radar cross section (NRCS) images in dB.

  5. Terrain correction: The SAR images representing the \(\sigma ^0_\text {dB}\) information are geocoded into a geographical coordinate system, the World Geodetic System 84. The ocean is masked out with digital elevation model (DEM) information. After this step, each pixel occupies \(8.9832\times 10^{-5}\) degrees.

  6. Subset generation: The pre- and post-event images are transformed into the same coordinate system for each pair of data used to create the dataset. We trimmed the subsets from the pre- and post-event images to the same coverage.
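As a minimal sketch of step 4, the conversion from linear-scale \(\sigma ^0\) to dB can be written as follows; the clipping floor is an assumed guard against non-positive values, which the preprocessing description does not specify.

```python
import numpy as np

def linear_to_db(sigma0, floor=1e-10):
    """Convert linear-scale sigma^0 to the NRCS in decibels.

    `floor` clips non-positive values before taking the logarithm; the
    exact clipping strategy used in the actual preprocessing is assumed.
    """
    return 10.0 * np.log10(np.maximum(sigma0, floor))

sigma0 = np.array([1.0, 0.1, 0.01])
sigma0_db = linear_to_db(sigma0)  # -> [0., -10., -20.]
```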

After preprocessing, the pre- and post-event images are geometrically matched. We cut each pair into non-overlapping \(256 \times 256\) samples, with the pre- and post-event channels stacked together. The sample numbers for all pairs are shown in Table 1.
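The tiling into non-overlapping samples can be sketched as below; the image size and two-channel (VV, VH) layout in the example are illustrative assumptions.

```python
import numpy as np

def tile_pair(pre, post, size=256):
    """Cut a geometrically matched pre/post image pair (H, W, C each)
    into non-overlapping size x size samples with stacked channels."""
    h, w, _ = pre.shape
    samples = []
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            pre_tile = pre[i:i + size, j:j + size, :]
            post_tile = post[i:i + size, j:j + size, :]
            # Stack pre- and post-event channels along the channel axis
            samples.append(np.concatenate([pre_tile, post_tile], axis=-1))
    return np.stack(samples)

# Example: a hypothetical 512 x 768 dual-polarization (VV, VH) pair
pre = np.zeros((512, 768, 2))
post = np.zeros((512, 768, 2))
batch = tile_pair(pre, post)
print(batch.shape)  # (6, 256, 256, 4)
```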

Fig. 2

Flowchart of data preprocessing

Fig. 3

The proposed model design. a The proposed SARCFMNet model structure. b The abstracted model design can be generalized to multiple ocean remote sensing image information mining problems

3 Model

The SARCFMNet model is specially tailored from the U-Net model [27] for its verified effectiveness. The U-Net model was proposed for biomedical image segmentation; its architecture was created so that it could function with fewer training samples while still producing precise segmentations. The proposed SARCFMNet is shown in Fig. 3a. The network can be divided into two paths. The left path is an encoding path that extracts abstracted features for accurate classification, down-sampling stage by stage. The right path is a decoding path that up-samples the feature maps. Skip connections from the encoding to the decoding path provide the latter with high-resolution features via concatenation. As illustrated in Fig. 3a, the encoding path consists of \(3 \times 3\) convolutions activated by the rectified linear unit (ReLU) and \(2 \times 2\) max-pooling; \(3 \times 3\) convolutions with ReLU activation and \(2 \times 2\) up-sampling operations constitute the decoding path. The output layer of the model is a Sigmoid-activated \(1 \times 1\) convolution, so the model predicts the probability of each pixel being flooded. The loss function of the model is the binary cross-entropy (BCE) loss [17], and pixel-wise classification accuracy is used as the metric to evaluate model performance.
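The encoder-decoder structure described above can be sketched in Keras. The depth, filter counts, and six-channel input here are illustrative assumptions, not the exact SARCFMNet configuration of Fig. 3a; the sketch only mirrors the building blocks named in the text (\(3 \times 3\) ReLU convolutions, \(2 \times 2\) pooling and up-sampling, skip concatenations, a Sigmoid \(1 \times 1\) output, and the SpatialDropout2D regularization discussed later).

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 ReLU-activated convolutions, as in the U-Net building block
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet_sketch(input_channels=6, base_filters=16, dropout_rate=0.5):
    inputs = keras.Input(shape=(256, 256, input_channels))
    # Encoding path: features are abstracted while down-sampling
    c1 = conv_block(inputs, base_filters)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, base_filters * 2)
    p2 = layers.MaxPooling2D(2)(c2)
    # Bottleneck with channel-level dropout (SpatialDropout2D)
    b = conv_block(p2, base_filters * 4)
    b = layers.SpatialDropout2D(dropout_rate)(b)
    # Decoding path: up-sample and concatenate skip-connection features
    u2 = layers.UpSampling2D(2)(b)
    c3 = conv_block(layers.Concatenate()([u2, c2]), base_filters * 2)
    u1 = layers.UpSampling2D(2)(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), base_filters)
    # Sigmoid-activated 1x1 convolution -> per-pixel flood probability
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_unet_sketch()
```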

In the SARCFMNet model, there are two specially-tailored modifications designed for coastal flooding mapping.

  1. Physics-aware input information design—Defined by the problem, we design different input information combinations as in Fig. 3a. For \(\sigma \), the superscript indicates pre- or post-event information, and the subscript shows the polarization. Bi-temporal information with pre- and post-event images from a single polarization is often used in flood mapping [3, 10, 14]; this is a direct design. In this study, based on radar remote sensing physics, we propose that the VV and VH polarization information should be fused, since the two polarizations can compensate each other. In addition, we propose that the temporal difference images should also be used. From Sect. 2.2, we know that the preprocessed images represent the backscattering information in the log scale. Therefore, the temporal difference images \(\sigma ^\text {post}_\text {VV}-\sigma ^\text {pre}_\text {VV}\) and \(\sigma ^\text {post}_\text {VH}-\sigma ^\text {pre}_\text {VH}\) represent the log-ratio information for VV and VH, respectively. From previous studies [1], we know the log-ratio is useful for SAR image change detection. Based on radar remote sensing physics, the SARCFMNet model fuses temporal, log-ratio, and polarization information together, denoted as DUAL+Diff. This approach increases the accuracy and reliability of the DCNN model, making it appropriate for coastal flood mapping from SAR remote sensing data. The fused input information sources are integrated as a data cube, realizing information fusion with little increase in parameters.

  2. DCNN-suitable regularization design—For DCNN models such as the proposed SARCFMNet, the ability to generalize is limited by overfitting. When a model overfits, it might produce excellent results during the training phase but poor results in practice. Dropout is a suitable scheme to avoid overfitting for fully connected networks, although it is not as helpful for convolutional layers [8]. From the network design, we can see that there are no fully connected layers in the model; it is a fully convolutional model. In this case, we should use a dropout variant that is effective for convolutional layers. Here, inspired by Tompson et al. [30], we include the SpatialDropout2D (SD2D) layer, which applies channel-level dropout to accomplish regularization and increase the model's generalization ability.
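The DUAL+Diff input cube of modification 1 can be sketched as below; the channel ordering is an assumption for illustration. Because the preprocessed images are in dB, the temporal difference channels are exactly the log-ratio of the linear backscatter.

```python
import numpy as np

def build_dual_diff_cube(pre_vv, pre_vh, post_vv, post_vh):
    """Fuse temporal, polarimetric, and log-ratio information into one
    input cube (DUAL+Diff). Inputs are NRCS images in dB, so the temporal
    differences equal the log-ratios of the linear backscatter."""
    diff_vv = post_vv - pre_vv  # = 10*log10(post_linear / pre_linear)
    diff_vh = post_vh - pre_vh
    # Channel order is an illustrative assumption
    return np.stack([pre_vv, post_vv, pre_vh, post_vh,
                     diff_vv, diff_vh], axis=-1)

# Sanity check: the dB difference equals the log-ratio of linear values
pre_lin, post_lin = 0.01, 0.1
pre_db, post_db = 10 * np.log10(pre_lin), 10 * np.log10(post_lin)
assert np.isclose(post_db - pre_db, 10 * np.log10(post_lin / pre_lin))

cube = build_dual_diff_cube(*(np.zeros((256, 256)) for _ in range(4)))
print(cube.shape)  # (256, 256, 6)
```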

The model can be generalized to multiple problems; it can be abstracted as the design in Fig. 3b. There are five modules: (1) module 1 for encoding; (2) module 2 for decoding; (3) module 3 for generating high-level bottleneck features; (4) module 4 for outputting predictions with adaptive processes; (5) module 5 for fusing feature information across the skip connections. With suitable input information and specially-tailored modifications, this abstracted model can realize multi-task pixel-level ocean remote sensing image information mining [15].

4 Performance Evaluations and Discussions

4.1 Performance Evaluations

The six image pairs yield 10049 samples, as shown in Table 1. We randomly choose roughly 20% of the samples in each pair to create a sub-dataset with 2000 samples, denoted as the S2000 dataset. The S2000 dataset is used for model training: \(70\%\) (1400 samples) are randomly selected for training, and the remaining \(30\%\) (600 samples) are used for validation. The SpatialDropout2D layer has one hyperparameter, the dropout rate, which we set to 0.5; results from the model with SpatialDropout2D are therefore identified by the suffix _SD2D0.5. The model training and testing are implemented in the software framework Keras. The optimizer for model fitting is Adam, the batch size is 32, and the total number of epochs is 300. The model parameters are selected according to performance on the validation set. We use one Nvidia GeForce GTX 1080Ti graphics processing unit (GPU) card; the training time on the S2000 dataset is about 6.7 hours.
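The 70/30 training-validation split can be sketched as follows. This is a hypothetical split over sample indices (the actual selection operates on the image samples themselves, and the seed is an assumption for reproducibility):

```python
import random

def split_s2000(sample_ids, train_frac=0.7, seed=0):
    """Randomly split the S2000 samples into training and validation sets."""
    rng = random.Random(seed)
    ids = list(sample_ids)
    rng.shuffle(ids)
    n_train = int(len(ids) * train_frac)
    return ids[:n_train], ids[n_train:]

train_ids, val_ids = split_s2000(range(2000))
print(len(train_ids), len(val_ids))  # 1400 600
```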

The losses and accuracies for the training and validation are documented and analyzed. The readers can find the details in [18]. The conclusions drawn from the analyses are listed here.

  1. In all the settings, the models are fully trained, and the validation loss indicates that the models do not overfit.

  2. The usefulness of the log-ratio information—The performance comparison shows that including the log-ratio information improves the model's performance for coastal inundation mapping.

  3. The usefulness of the dual-polarization fusion—The performance comparison shows that fusing the polarization information improves the model's performance for coastal inundation mapping. In addition, the VH polarization achieves better performance than the VV polarization, possibly because VH is less sensitive to the wind conditions during flood mapping. We discuss this later.

  4. The usefulness of the DCNN-suitable regularization design—With the regularization layer suitable for the fully convolutional model, performance decreases during training but increases during validation. This indicates that the regularization design makes the model more reliable.

Table 2 The trained SARCFMNet model’s performance on the dataset

The SARCFMNet trained on the S2000 dataset is applied to the dataset created in Sect. 2.2. The results are given in Table 2. The column names indicate the input data and regularization scheme; the row names indicate the subsets. Each block of the table contains four numbers: classification accuracy, recall, precision, and F1 score, in that order. Recall is the ratio of true positives to the sum of true positives and false negatives; a higher recall indicates that the model misses fewer areas that are actually flooded. Precision is the ratio of true positives to the sum of true positives and false positives; a higher precision indicates that the model is less likely to produce incorrect flooding areas. The F1 score is the harmonic mean of precision and recall, and it leans toward the lower of the two. The last row shows the weighted average, with weights determined by the number of samples in each subset. The best accuracy and F1 are underlined; the block with the best accuracy also has the best F1 score. From this table, we can draw conclusions consistent with those above:
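The four metrics reported in Table 2 can be computed from binary flood masks as in the following plain-Python sketch (the sample labels are synthetic):

```python
def flood_metrics(y_true, y_pred):
    """Pixel-wise accuracy, recall, precision, and F1 for binary flood
    maps, given flat sequences of 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, recall, precision, f1

# Higher recall -> fewer missed flooded pixels; higher precision -> fewer
# false flood detections; F1 leans toward the lower of the two.
acc, rec, prec, f1 = flood_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```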

  1. The fusion of dual-polarization information improves the model's coastal inundation mapping performance.

  2. The model achieves better performance on VH polarization than on VV polarization.

  3. The DCNN-suitable regularization improves the performance and robustness of the model.

Fig. 4

Visual evaluation on the IW01 subset. a and b are the pre- and post-event NRCS images for the VV. c and d are the pre- and post-event NRCS images for the VH. e Ground truth. f Mapping prediction with the model (DUAL+Diff_SD2D0.5)

The visual evaluation on the IW01 subset, used as an example for presentation, is shown in Fig. 4. The first and second rows show the VV and VH images, respectively; within each row, the first and second columns show the pre- and post-event images. The ground truth and the flooding prediction from the model (DUAL+Diff_SD2D0.5) are shown in Fig. 4e, f. Comparing Fig. 4e and f, we observe that the mapping prediction is very close to the ground truth, indicating that the presented model is effective.

4.2 Spatial and Temporal Characteristics

After applying the trained SARCFMNet model to the dataset described in Sect. 2, we can analyze the spatial and temporal characteristics of the 2017 Harvey-induced coastal inundation.

The image pair SM01 is selected as a case for the geospatial analysis. The predicted flooding map is shown in Fig. 5a. To perform the geospatial analysis, we collect useful supporting data, shown in Fig. 6. The supporting data include: (1) the elevation data of the scene, from the United States Geological Survey (USGS) National Elevation Dataset [32], shown in Fig. 6a; (2) the land cover types of the scene, from the 2016 National Land Cover Database (NLCD) [37], shown in Fig. 6b with legend; and (3) the historical water occurrence data of the scene, from the Global Surface Water Mapping Dataset (1984–2015) [24], shown in Fig. 6c.

Fig. 5

Subset SM01 for the geospatial analysis. a Coastal inundation prediction from the SARCFMNet model. b Inundation heat map generated from the prediction

Fig. 6

Supporting data for the spatial analysis. a The elevation data of the scene, from the United States Geological Survey (USGS) National Elevation Dataset; b The land cover types of the scene, from the 2016 National Land Cover Database (NLCD); c the historical water occurrence data of the scene, from the Global Surface Water Mapping Dataset (1984–2015)

Fig. 7

Geospatial analysis for the SM01 subset. a The elevation distribution of the flooded and non-flooded areas. b The proportion of land cover types affected by the flooding. c The historical water occurrence of the flooded areas

Fig. 8

Temporal analysis of SARCFMNet-generated flooding maps of IW01 (August 29), IW05 (August 30), SM01 (September 4), and IW04 (September 5). a A Moderate Resolution Imaging Spectroradiometer (MODIS) image shows Harvey on August 26, 2017. b The flooding duration probability of the region, marked with a green rectangle in a. c Temporal flooding transition from IW01 to IW05 of the region, marked with a red rectangle in b. d Flooding area proportions

We can derive several geospatial findings from the mapping predictions and supporting data:

  1. General analysis—In this scene, the total flooded area is about \({284}\,\textrm{km}^{2}\) (about \(3\%\) of the scene). We use a disk-shaped average filter (radius = 100 pixels) to process the flooding map and create a flooding heat map for the scene. In Fig. 5b, the heat map is overlaid on the pre-event image. As the heat map shows, severely flooded regions are densely scattered in the southern part of Houston.

  2. Relation with elevation—The elevation distributions of the flooded and non-flooded areas in the scene are analyzed and shown in Fig. 7a. The elevation distribution of the flooded areas differs from that of the non-flooded areas, and the former is obviously lower. Flooding is more likely to occur and persist in lower regions, in this scene the southern part.

  3. Relation with land cover types—The proportion of land cover types affected by the flooding is illustrated in Fig. 7b. It demonstrates that pasture and cultivated crops are the dominant land cover types in flooded regions, accounting for more than \(76\%\) of the flooded area. They are the main land cover types in the severely flooded southern part, so the flooding may severely damage local agriculture. However, we must recognize that even though the hurricane caused severe flooding in urban areas, inner-city flooding cannot be easily extracted by pure image-based analysis.

  4. Relation with historical water occurrence—The flooding is extracted by analyzing the pre- and post-event images, so we must confirm that it is not caused by seasonal or periodic increases in surface water. The historical water occurrence of the flooded areas is analyzed and shown in Fig. 7c. It reveals that, in the flooded areas, the historical water occurrence is extremely close to zero, signifying that the predicted flooding is anomalous and should raise alarm.
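Two of the operations above, the area estimate and the disk-shaped averaging kernel behind the heat map, can be sketched as follows. The 10 m pixel spacing is an approximate figure implied by the 8.9832e-5-degree pixel size (the east-west span shrinks with latitude), and the mask is synthetic:

```python
import numpy as np

def flooded_area_km2(flood_mask, pixel_size_m=10.0):
    """Approximate flooded area from a binary mask, assuming roughly
    10 m x 10 m pixels."""
    return flood_mask.sum() * (pixel_size_m ** 2) / 1e6

def disk_kernel(radius):
    """Disk-shaped averaging kernel, as used to build the heat map."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = (x ** 2 + y ** 2 <= radius ** 2).astype(float)
    return disk / disk.sum()  # normalized so it averages, not sums

mask = np.zeros((1000, 1000), dtype=int)
mask[:100, :100] = 1  # 10,000 flooded pixels
print(flooded_area_km2(mask))  # 1.0 (km^2)
k = disk_kernel(100)  # convolving the mask with k yields the heat map
```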

For SM01, IW01, IW04, and IW05, the mapping products have an overlapping region, in which we perform the multi-temporal study of the mapping results. Figure 8 depicts the temporal analysis. Figure 8a shows a Moderate Resolution Imaging Spectroradiometer (MODIS) image of Harvey on August 26, 2017; the region for temporal analysis is marked with the green rectangle. The flooding duration probability of the overlapping zone is shown in Fig. 8b. It can assist us in comprehending the temporal evolution of the floods: the locations with the highest probability are likely to be the last to dry out. Pixels marked with a probability \(< 0\) are not covered by all of the mapping products needed for the temporal analysis.

Figure 8d shows the flooding area proportions of the product sequence: IW01 (August 29), IW05 (August 30), SM01 (September 4), and IW04 (September 5). It shows how the flooded areas in the region shrink over time as the product sequence progresses. Using regression analysis, we estimate that the shrinkage rate is around \(1\%\) of the region area (roughly \({23}\,\textrm{km}^{2}\)) per day.
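The regression estimate can be sketched as follows; the flooded-area proportions below are hypothetical placeholders (the real values are read from Fig. 8d), chosen only to illustrate a 1%-per-day linear shrinkage.

```python
import numpy as np

# Hypothetical flooded-area proportions on the four acquisition dates,
# with days counted from August 29: IW01, IW05, SM01, IW04
days = np.array([0.0, 1.0, 6.0, 7.0])
proportions = np.array([0.10, 0.09, 0.04, 0.03])

# Least-squares line through (day, proportion); the negated slope is the
# fraction of the region area lost per day
slope, intercept = np.polyfit(days, proportions, 1)
shrink_per_day = -slope
print(round(shrink_per_day, 3))  # 0.01, i.e., about 1% of the region per day
```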

We discover a phenomenon of delayed flooding when examining the product sequence: one area shows no flooding in IW01 (August 29) but shows flooding in IW05 (August 30). This region is marked with a red rectangle in Fig. 8b, and the temporal flooding transition from IW01 to IW05 is examined in Fig. 8c. The region is in Glen Flora, Texas. We checked the news [11] and discovered that the Colorado River (Texas) began flowing through and across the region on the evening of August 29 (local time). This is why the flooding is captured by IW05 (sensing time: 12:22 UTC, August 30) but not by IW01 (sensing time: 00:26 UTC, August 29). The cause of the delayed flooding deserves further study.

4.3 Discussions of Performance

In this part, we discuss the performance of the proposed model in two aspects: first, the influence of wind; and second, cost-sensitive losses to adjust the performance.

Table 3 Toy experiment results of wind influence

The influence of wind is a factor seldom discussed in flood mapping. However, in the case of storm-induced coastal flood mapping, it is a practical issue. We may encounter the following scenario: the storm has already generated coastal flooding, which is captured by remote sensing data, but the storm has not yet left the scene and is still delivering strong winds. In this case, the wind can have adverse effects on inundation mapping, since strong wind increases the backscattering of water areas. We face this situation in this study: strong wind influences the post-event image of IW01, as described in Sect. 2.1. We use a toy example to demonstrate the impact of wind on the performance of the DCNN model.

400 samples are chosen from IW01 and from IW03 to create IW01_selected and IW03_selected, respectively. Then, using the DUAL+Diff architecture, we train two models on IW01_selected and IW03_selected and test them on IW01 and IW03. There are four scenarios: (1) IW01_selected training, IW01 testing; (2) IW01_selected training, IW03 testing; (3) IW03_selected training, IW01 testing; (4) IW03_selected training, IW03 testing. Table 3 contains their results. The column names in this table denote the training subsets, whereas the row names denote the testing subsets. The numbers in each block indicate classification accuracy, recall, and precision. The numbers on the table's diagonal correspond to cases where the training scene is the same as the testing scene; the performances are excellent for obvious reasons. Off the diagonal, performance drops sharply. The model trained on IW01_selected and tested on IW03 presents low precision. As noted above, the post-event image of IW01_selected is impacted by severe wind. In this situation, the subset IW01_selected teaches the model to find flooding areas with higher backscattering in the post-event image, which causes false positive predictions and lower precision when the model is tested on IW03. By similar logic, we can understand why the model trained on IW03_selected and tested on IW01 presents low recall.

Based on the explanations from the toy example, we can better understand the overall performance evaluation listed in Table 2. We make two observations.

  1. The model trained on VH polarization performs better than that trained on VV polarization. The likely explanation is that VH is less sensitive to the wind conditions.

  2. Because the S2000 dataset is created from data under different wind conditions, the model trained on it performs better in terms of balance. However, the results, particularly those tested on IW01 and IW03, still show the impact of wind. This tells us that, in future research, DCNN models should be aware of the wind conditions; it is a direction for further improving performance.

    a. Since VH is less sensitive to wind conditions, the model could use only the VH polarization. However, VV has its own advantages for flood mapping, so this design may lose much information.

    b. The wind information can be input together with the image information, and the dual-polarization information fusion can be realized in a more flexible way.

In the deep learning-based paradigm for image understanding, loss functions play an important role. They set the objectives for the models, pushing the predictions close to the targets; the closeness is measured by the losses. In this study, the BCE loss is useful and suitable for binary classification. The BCE loss can be adjusted according to user-defined costs, and the performance will adjust accordingly. We use toy experiments with two models of the DUAL+Diff design: one trained on IW01_selected and tested on IW01, and one trained on IW03_selected and tested on IW03.

The BCE loss is used first, and the results are shown in Table 4. The numbers in each block are classification accuracy, recall, and precision. From the first row, the BCE loss is capable of balancing recall and precision.

Table 4 Toy experiment results of cost-sensitive losses

In real applications, users may have personalized needs: higher recall or higher precision. These personalized needs can be understood as cost-defined requests. If users believe that the cost of low recall is very high, the model must improve recall at the price of precision; similarly, if users believe that the cost of low precision is very high, the model must improve precision at the price of recall. To meet these requirements, cost-sensitive losses are utilized. The weighted \(\alpha \)-balanced BCE (\(\alpha \)BBCE) loss [17] is one technique for building cost-sensitive losses:

$$\begin{aligned} L_{\alpha \text {BBCE}} = - \frac{1}{N} \sum _{i=1}^{N} \left\{ \alpha y_i \log \hat{y}_i + (1-\alpha ) (1-y_i) \log \left( 1-\hat{y}_i\right) \right\} \end{aligned}$$
(1)

where the ith pixel's label is denoted \(y_i\); the prediction is denoted \(\hat{y}_i\); the pixel number is denoted N; and the weight is denoted \(\alpha \in [0, 1]\). A higher \(\alpha \) gives the flooded class more weight during training. Given that the number of flooding pixels is significantly smaller than the number of non-flooding pixels, a larger \(\alpha \) is appropriate; in this experiment, \(\alpha \) is set to 0.8. As seen in the second row of Table 4, the \(\alpha \)BBCE loss is effective: because the flooded class is given more weight during training, recall is increased at the price of precision.
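Equation (1) can be implemented directly; the following NumPy sketch uses a synthetic example in which flooding pixels are the minority and the model badly under-predicts the flooded pixel.

```python
import numpy as np

def bce(y, p, eps=1e-7):
    """Plain binary cross-entropy."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def alpha_bbce(y, p, alpha=0.8, eps=1e-7):
    """Alpha-balanced BCE (Eq. 1): alpha weights the flooded (y = 1) term."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(alpha * y * np.log(p)
                    + (1 - alpha) * (1 - y) * np.log(1 - p))

# Synthetic example: one flooded pixel among four, badly under-predicted
y = np.array([1.0, 0.0, 0.0, 0.0])
p = np.array([0.1, 0.1, 0.1, 0.1])

# With alpha = 0.5, the loss is exactly half the plain BCE
assert np.isclose(alpha_bbce(y, p, alpha=0.5), 0.5 * bce(y, p))
# When the model misses the flooded pixel, a larger alpha raises the loss,
# pushing training toward higher recall
assert alpha_bbce(y, p, alpha=0.8) > alpha_bbce(y, p, alpha=0.5)
```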

Another technique for building a cost-sensitive loss is to utilize the F\(\beta \) score directly:

$$\begin{aligned} L_{\text {F} \beta } = 1 - \underbrace{\left( 1+\beta ^2\right) \frac{P \cdot R}{\beta ^2 \cdot P + R}}_{\text {F}\beta \ \text {score}} \end{aligned}$$
(2)

where R and P are recall and precision, respectively, and \(\beta \) is a positive real weight. Minimizing the F\(\beta \) loss increases the F\(\beta \) score. If \(\beta \) is greater than 1, optimizing recall receives more attention during training; if \(\beta \) is less than 1, optimizing precision receives more attention. This is clearly a more direct way of controlling recall and precision in the results according to their importance. The 3rd and 4th rows of Table 4 show the results of the F\(\beta \) loss with \(\beta = 2\) and \(\beta = 0.5\), respectively. The results confirm that the F\(\beta \) loss is an effective way to adjust recall and precision according to their importance: (1) for \(\beta = 2\), recall is increased at the price of precision; (2) for \(\beta = 0.5\), precision is increased at the price of recall.
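Equation (2) only becomes trainable if true positives are accumulated softly from the predicted probabilities rather than from hard decisions. A minimal sketch of such a soft F\(\beta \) loss follows; the function name and the smoothing constant are illustrative, not the chapter's code:

```python
import numpy as np

def f_beta_loss(y_true, y_pred, beta=2.0, eps=1e-7):
    """Soft F-beta loss, Eq. (2), computed on predicted probabilities.

    True positives are summed softly so the loss stays differentiable;
    beta > 1 favours recall, beta < 1 favours precision.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    tp = np.sum(y_true * y_pred)                 # soft true positives
    precision = tp / (np.sum(y_pred) + eps)
    recall = tp / (np.sum(y_true) + eps)
    f_beta = (1.0 + beta**2) * precision * recall \
        / (beta**2 * precision + recall + eps)
    return 1.0 - f_beta
```

For a prediction that misses a flooding pixel, the loss under \(\beta = 2\) is larger than under \(\beta = 0.5\), which is exactly the recall-favouring behaviour described above.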

Here, we use toy examples to show the design of cost-sensitive losses. Two points should be kept in mind when designing these losses.

  1. There is a performance tradeoff between recall and precision (there is no free lunch).

  2. One more hyper-parameter must be pre-defined. This extra hyper-parameter gives us finer control over the performance.

5 Application Case in Bangladesh

Bangladesh is a participating country in the Belt and Road Initiative, and it is one of the UN-defined least developed countries. Bangladesh is located on the coast of the Indian Ocean and has low-lying terrain. Under the influence of the rainy season and tropical cyclones, severe flooding occurs every summer, especially from June to October. Flooding poses a huge threat to the safety of people’s lives and property in the country, and has become an obstacle to the country’s development. This chapter uses the SARCFMNet model to carry out nation-level, multi-year, high-temporal-resolution flooding mapping of Bangladesh from 2016 to 2020. This can deepen our understanding of the flooding mechanism in Bangladesh, and provide powerful technology and data support for disaster mitigation and flood forecasting.

To provide the nation-level, multi-year, high-temporal-resolution flooding mapping products for Bangladesh from 2016 to 2020, we apply the following processing to the Sentinel-1 data, based on the preprocessing introduced in Sect. 2.2.

  1. For each year, we select images from February to March of that year to put together a nation-level pre-event image.

  2. For each year, for each time window from June to October, we select images to put together a nation-level post-event image.

  3. The SARCFMNet model trained on the S2000 dataset is applied to the image pairs to obtain the nation-level flooding mapping results.

From the aforementioned steps, we provide 45 nation-level flooding mapping results:

  1. Year 2016: from June to October, one nation-level flooding map is provided every month.

  2. Year 2017: from June to October, two nation-level flooding maps are provided every month, for the first and second halves of the month.

  3. Year 2018: from June to October, two nation-level flooding maps are provided every month, for the first and second halves of the month.

  4. Year 2019: from June to October, two nation-level flooding maps are provided every month, for the first and second halves of the month.

  5. Year 2020: from June to October, two nation-level flooding maps are provided every month, for the first and second halves of the month.

In 2016, the temporal resolution of Sentinel-1 SAR data was relatively low, so there is one nation-level flooding map per month; from 2017 to 2020, there are two nation-level flooding maps per month.
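The 45 product time windows can be enumerated programmatically as a sanity check. The sketch below is ours, not the chapter's pipeline code; it only encodes the window scheme just described (monthly for 2016, half-monthly for 2017–2020):

```python
import calendar
from datetime import date

def product_windows(year):
    """Time windows of the nation-level products for one year (Jun-Oct).

    2016 has one window per month; 2017-2020 have two windows per
    month (first and second halves), giving 5 + 4 * 10 = 45 in total.
    """
    windows = []
    for month in range(6, 11):                       # June .. October
        last = calendar.monthrange(year, month)[1]   # days in month
        if year == 2016:
            windows.append((date(year, month, 1), date(year, month, last)))
        else:
            windows.append((date(year, month, 1), date(year, month, 15)))
            windows.append((date(year, month, 16), date(year, month, last)))
    return windows

total = sum(len(product_windows(y)) for y in range(2016, 2021))  # 45
```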

Fig. 9

Bangladesh nation-level flooding occurrence probability maps from 2016 to 2020. The flooding occurrence probability map is generated from the flooding maps of each year from June to October

The flooding maps have the following characteristics:

  • Spatial extent: Bangladesh

  • Temporal extent: 2016–2020

  • Spatial resolution: 3 arcseconds, consistent with mainstream global DEM products

  • Temporal resolution: half a month (for 2016, one month)

Based on the flooding maps, we first analyze the flooding occurrence probability of each year, shown in Fig. 9. The flooding occurrence probability map is generated from the flooding maps of each year from June to October. It shows that, each year, the spatial distribution of high flooding occurrence probability is relatively stable. Each year also has some flooded areas that are not flooded in other years; based on the products, these cases can be analyzed individually.
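The chapter does not spell out how the probability map is computed; a natural reading of the Fig. 9 caption is the per-pixel fraction of time windows in which the pixel is flooded. A minimal sketch under that assumption (the function name is ours):

```python
import numpy as np

def occurrence_probability(flood_maps):
    """Per-pixel flooding occurrence probability from binary flood maps.

    flood_maps: array of shape (T, H, W) with 1 = flooded, 0 = dry,
    one map per time window (e.g. Jun-Oct of one year). The probability
    is the fraction of windows in which each pixel is flooded.
    """
    stack = np.asarray(flood_maps, dtype=float)
    return stack.mean(axis=0)
```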

Fig. 10

Bangladesh nation-level flooding extent area from 2016 to 2020. For 2016, the 00 after the month in the x-axis labels indicates one flooding map product per month. From 2017 to 2020, the 01 after the month indicates the product for the first half of the month, and 02 the second half

Based on the flooding maps, we then analyze the flooding extent of each year, shown in Fig. 10. From this analysis, we obtain the following information:

  1. We already know that flooding in Bangladesh mainly happens from June to October, due to the rainy season and tropical cyclones. From the yearly flooding extent from 2016 to 2020, we can narrow the most severely flooded time window down to the second half of July through the first half of August.

  2. For each year from 2016 to 2020, the peak flooding area is around \(2 \times 10^{4}\,\textrm{km}^{2}\).
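A flooding extent in km² can be approximated from a binary flood map at 3-arcsecond resolution by summing latitude-dependent pixel areas (pixel width shrinks with the cosine of latitude). The constants and function name below are ours, for illustration only:

```python
import numpy as np

ARCSEC_M = 30.87  # approx. metres per arcsecond along a meridian

def flood_area_km2(flood_map, lat_deg, pixel_arcsec=3.0):
    """Approximate flooded area of a binary map (1 = flooded).

    flood_map: array of shape (H, W); lat_deg: per-row latitudes in
    degrees, or a scalar mid-latitude (around 23.7 for Bangladesh).
    """
    side_km = pixel_arcsec * ARCSEC_M / 1000.0   # ~0.093 km at the equator
    rows = np.broadcast_to(np.asarray(lat_deg, dtype=float),
                           (flood_map.shape[0],))
    pixel_km2 = side_km ** 2 * np.cos(np.radians(rows))
    return float((flood_map.sum(axis=1) * pixel_km2).sum())
```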

The above two analyses illustrate the usefulness of the provided flooding maps. By using the nation-level, multi-year, high-temporal-resolution flooding mapping products of Bangladesh from 2016 to 2020, we can perform more targeted spatial and temporal analyses. Hopefully, this can deepen our understanding of the flooding mechanism in Bangladesh, and provide powerful information support for disaster mitigation and flood forecasting.

6 Conclusions

The SARCFMNet model, which mines multi-temporal and dual-polarimetric SAR data for coastal inundation classification, is presented in this chapter. SARCFMNet is built on U-Net, a benchmark deep learning model for pixel-level classification, which we have modified for the challenges of coastal inundation mapping from SAR imagery: 1) radar remote sensing physics-driven input information design; and 2) regularization suitable for fully convolutional networks. We present two study cases in this chapter. First, SARCFMNet is trained and evaluated using a dataset derived from Houston, Texas, as affected by Hurricane Harvey in 2017. Six image pairs, with ground truth delineated by human analysts with the help of Google Earth and OpenStreetMap, are used to test the proposed SARCFMNet model. The average mapping accuracy and F1 score are 0.98 and 0.88, respectively, better than those of the benchmark deep learning model for pixel-level classification. This verifies the usefulness of the proposed designs. The geospatial study of Harvey-caused floods is performed using the flooding predictions and indicates Harvey’s massive impact on agriculture. The multi-temporal study estimates the rate at which flooding receded and uncovers a delayed-inundation phenomenon. Second, the trained and verified SARCFMNet model is applied to Bangladesh, one of the UN-defined least developed countries, to produce nation-level, multi-year, high-temporal-resolution flooding maps. The flooding maps of Bangladesh cover 2016 to 2020, with a spatial resolution of 3 arcseconds and a temporal resolution of half a month (for 2016, one month). This can help us gain a deeper understanding of the flooding mechanism of this country. In addition, the impact of meteorological factors on DCNN-based flooding mapping models and the design of cost-sensitive losses are discussed. We expect that this model can be readily generalized to other multi-temporal ocean remote sensing imagery information mining problems.