Oil Spill Identification based on Dual Attention UNet Model Using Synthetic Aperture Radar Images

Mahmoud, Amira S.; Mohamed, Sayed A.; El-Khoriby, Reda A.; AbdelSalam, Hisham M.; El-Khodary, Ihab A.

doi:10.1007/s12524-022-01624-6


Research Article
Open access
Published: 20 November 2022

Volume 51, pages 121–133, (2023)

Journal of the Indian Society of Remote Sensing

Amira S. Mahmoud ORCID: orcid.org/0000-0003-2738-0819¹,
Sayed A. Mohamed¹,
Reda A. El-Khoriby²,
Hisham M. AbdelSalam² &
…
Ihab A. El-Khodary²


Abstract

Oil spills cause tremendous damage to marine and coastal environments and ecosystems. Previous deep learning-based studies have addressed oil spill detection as a semantic segmentation problem. However, further improvement is still required to address the noisy nature of Synthetic Aperture Radar (SAR) imagery, which limits segmentation performance. In this study, a new deep learning model based on the Dual Attention Model (DAM) is developed to automatically detect oil spills in a water body. We enhanced a conventional UNet segmentation network by integrating the DAM to selectively highlight the relevant and discriminative global and local characteristics of oil spills in SAR imagery. The DAM is composed of a Channel Attention Map and a Position Attention Map, which are stacked in the decoder network of UNet. The proposed DAM-UNet is compared with four baselines, namely the fully convolutional network, PSPNet, LinkNet, and traditional UNet, and is shown empirically to outperform all four. Moreover, the EG-OilSpill dataset, a large set of SAR images with 3000 image pairs, is introduced. The overall accuracy of the proposed method reaches 94.2%, an increase of 3.2% over that of the traditional UNet. The study opens new development ideas for integrating attention modules into other deep learning tasks, including machine translation, image-based analysis, action recognition, and speech recognition.


Introduction

The frequent oil discharge from ships or oil platforms has become a major threat to our coastal ecosystem and may generate large economic losses for maritime activities interrupted by this pollution (Migliaccio et al., 2007). To manage and minimize oil spills, they should be first identified. Satellite imaging can be an efficient tool for this purpose. Remote sensing imageries, either optical or radar, have been extensively utilized in oil spill detection. Various studies have documented the importance of SAR in oil spill detection tasks (Cantorna et al., 2019).

Polarimetric SAR offers a massive range of features that effectively enhance oil spill extraction and detection (Zhang et al., 2017). Radar systems mounted on aircraft and satellites provide images of sea and land surfaces (Bovenga, 2020). The SAR sensor sends out radio waves, which are reflected off the surfaces and used to create a visual interpretation of the target surface. Numerous approaches have been introduced in the literature to automatically identify oil spills; however, the SAR image feature extraction through the traditional approach has become a major drawback limiting the performance (Al-Ruzouq et al., 2020; Kolokoussis & Karathanassi, 2018). The feature extraction becomes increasingly cumbersome as the number of classes to classify increases. Expert judgment, along with several trial-and-error processes, decides which features best describe different object classes. Moreover, each feature definition requires dealing with a plethora of parameters, which must be fine-tuned.

Deep Convolutional Neural Networks (DCNNs) have impressively boosted the performance in many fields, such as: change detection (Mahmoud et al., 2021), super resolution (Moustafa & Sayed, 2021), hazard assessment, and object detection (Mahmoud et al., 2020). Because of the ability of recent architectures to explore significant multilevel deep features, Convolutional Neural Networks (CNNs) have been incorporated to efficiently solve complex functions (Chen et al., 2017). Attention mechanisms have recently become one of the most important concepts in the deep learning field. Attentions are typically inspired by the biological systems of humans, which tend to focus on the distinctive parts when processing large amounts of information. In diverse CNN architectures, several attention mechanisms have been widely used to improve the representative ability of these architectures by strengthening feature extraction. Examples of widely used attention mechanisms include channel-wise attention (Zhu et al., 2019), squeeze and excitation attention (Hu et al., 2018), and the pyramid attention model (Mei et al., 2020), among many others.

This study proposes a novel approach for identifying and segmenting oil spills in SAR satellite imagery based on the residual UNet model and attention learning. It incorporates the Dual Attention Model (DAM) (Fu et al. 2019), which is composed of channel-wise attention (Zhu et al., 2019) and position attention (Zhang et al., 2020), to amplify useful features and improve the oil detection accuracy. The two attention models learn the channel interdependencies and spatial interrelations of features, which allows the proposed model to selectively emphasize the informative and discriminative features of an oil spill. The proposed approach detects oil spill pixels with 94.2% accuracy, 89% precision, 88.4% recall, and 86.3% F1 score, improving on the accuracy of other architectures. The major contributions of this study are as follows:

The DAM block is plugged into the UNet architecture to identify oil spills using SAR images.
A weighted cross-entropy loss function for handling the imbalanced distribution of the oil spill dataset is used.
A newly collected SAR oil spill dataset, called EG-OilSpill, is used to test our technique against other baseline methods, including the traditional UNet without the DAM.

The remainder of this paper is structured as follows: Sect. 2 reviews the related published literature on deep learning for oil spills; Sect. 3 describes the suggested technique for identifying oil spills using satellite images; Sect. 4 discusses the experiment design and findings; and Sect. 5 draws the conclusions.

Related Work

Several attempts have been made to boost CNN performance in the oil spill identification problem (Krestenitis et al., 2019a, 2019b; Li et al., 2022; Rousso et al., 2022). However, labeled SAR datasets are limited, and this data scarcity (i.e., a limited amount or a complete lack of labeled training data) limits the performance and generalization of sophisticated deep learning frameworks. Attention mechanisms have recently become a popular strategy to boost performance. This section reviews recent CNN-based studies on oil spills and presents recent work on attention modules.

Neural Networks for Oil Spill Identification

Several works have adopted semantic segmentation using deep CNNs to detect oil slicks on the sea surface. Object detection, which is a subset of computer vision, is an automated method for locating essential objects in an image with respect to the background (Fang et al., 2020). Object detection has been adopted in many real-life applications, such as in human–computer interaction (Singh & Singh, 2018), robotics (Hwang et al., 2019), consumer electronics (e.g., smartphones) (Liu et al., 2019), image retrieval (Moustafa et al., 2020), and transportation (e.g., autonomous and assisted driving) (Feng et al., 2020). Deep learning is a state-of-the-art method used in recent object detection studies (Sharma & Mir, 2020).

In recent years, various deep learning architectures have been introduced for oil spill detection (Table 1). Krestenitis et al. (2019b) evaluated various DCNN architectures, namely UNet, LinkNet, Pyramid Scene Parsing Network (PSPNet), DeepLabv2, and DeepLabv3+, for oil spill detection and concluded that DeepLabv3+ yields the best results in terms of the test set accuracy and associated inference time. Song et al. (2020) designed a deep CNN appropriate for oil spill detection to automatically extract deep features from PolSAR data. Meanwhile, Guo et al. (2018) compared classical and deep learning methods and concluded that deep learning methods that use semantic segmentation for oil spill detection (e.g., SegNet and fully convolutional network [FCN]) outperform classical methods such as support vector machine and random forest. Yan et al. (2019) proposed a multifunction fusion neural network to detect oil spills as ocean phenomena; their method achieved the highest detection accuracy. A DCNN was built for SAR dark patch classification in oil spill detection and reported a higher accuracy in detecting oil spills and lookalikes (Zeng & Wang, 2020). By establishing the best initial values of the wavelet neural network (WNN) for oil spill classification, Song et al. (2017) showed experimentally that the optimized WNN largely improves ocean oil spill classification. In Cantorna et al. (2019), a CNN was used to perform automated oil spill detection alongside classical segmentation methods. Various recurrent neural networks (RNNs) were implemented and tested on Side-Looking Airborne Radar (SLAR) images for candidate oil spill detection, and these RNNs achieved higher accuracy (Alacid et al., 2017). Deep learning algorithms, such as sparse autoencoders and deep belief networks, were used for oil spill segmentation and outperformed traditional methods (Chen et al., 2017). DCNN models can automate contaminated area detection and achieve a higher performance (Krestenitis et al., 2019a). Xiong and Zhou (2019) proposed CNN-based SAR image recognition and reported a higher accuracy for oil spill tracking.

Table 1 Previous works on oil spill identification methods using deep learning


Attention Modules

Attention is a complicated and essential cognitive function for humans (Corbetta & Shulman, 2002). One important feature of human perception is that not all information can be processed at once. For example, people do not usually take in an entire scene from start to finish when they perceive things visually; instead, they observe and pay attention to certain parts as required. This technique helps in selecting high-value information given the limited available processing resources. An attention mechanism significantly increases the efficiency and precision of information processing. It can be viewed as a resource allocation scheme and is the principal means of resolving information overload: with limited computing power, the most relevant information can be processed more efficiently. This has drawn the interest of researchers in computer vision. An attention mechanism can also be used to explain otherwise incomprehensible neural architecture behavior and performance improvements. Despite the boost in neural network performance in many areas (e.g., financial (Zhang et al., 2019), material (Ieracitano et al., 2020), meteorology (Yu et al., 2015), medical (Liu et al., 2018), and autonomous driving (Ming et al., 2021)), interpretability has remained a major problem. Whether or not the attention mechanism can be effectively used to explain a deep network remains a topic of dispute (Li et al., 2022; Fu et al. 2019).

The Proposed Method

This section presents a detailed description of the proposed hybrid attention model for oil spills in SAR images. Figure 1 shows a graphical representation of the overall structure of the proposed framework. The proposed approach alters the traditional UNet architecture by incorporating dual attention to suppress irrelevant information flow and minimize false identifications (Fig. 1a). The advantages of the Position Attention Map (PAM) and the Channel Attention Map (CAM) are combined to gather contextual information better than the original UNet (Fig. 1b). We also included a tailored loss function.

Traditional UNet

The traditional UNet (Ronneberger et al., 2015) is an extension of the FCN architecture, initially proposed in 2015 for biomedical image semantic segmentation. It is now widely used in various applications. UNet has two blocks: an encoder and a decoder. The “U” in the name reflects the symmetry between the encoder and decoder blocks. The encoder block aims to capture an image’s meaningful feature map, whereas the decoder network upsamples the extracted feature map while decreasing the number of filters. The original UNet architecture includes four stages in both the encoder and decoder blocks. In each encoder stage, two 3 × 3 convolutions are applied, each followed by a rectified linear unit (ReLU), and a 2 × 2 max pooling operation with stride 2 is then applied for downsampling. Every step in the decoder path consists of an upsampling of the feature map followed by a 2 × 2 convolution that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the encoding path, and two 3 × 3 convolutions, each followed by a ReLU (Ronneberger et al., 2015; Lou et al., 2021). To improve the traditional UNet model, we incorporate the DAM to suppress the feature activations from irrelevant backgrounds. This is discussed in detail in the next section.
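The encoder and decoder bookkeeping described above can be traced with a short sketch. It is only an illustration under stated assumptions: a 256 × 256 input (the tile size of the dataset used later), 64 filters in the first stage as in the original UNet, and padded convolutions so that skip connections need no cropping. The function name `unet_shape_trace` is hypothetical and not part of the paper.

```python
def unet_shape_trace(size=256, base_channels=64, stages=4):
    """Return (stage name, channels, spatial size) for a four-stage UNet.

    Assumptions (not from the paper): 64 base filters and padded
    convolutions, so each stage keeps its spatial size until pooling.
    """
    trace, skips = [], []
    channels, side = base_channels, size

    # Encoder: two 3x3 conv + ReLU per stage, then 2x2 max pooling (stride 2).
    for k in range(stages):
        trace.append((f"enc{k + 1}", channels, side))
        skips.append((channels, side))
        side //= 2        # max pooling halves height and width
        channels *= 2     # the next stage doubles the number of filters

    trace.append(("bottleneck", channels, side))

    # Decoder: upsample, 2x2 convolution that halves the channels,
    # concatenate with the matching encoder map, then two 3x3 conv + ReLU.
    for k in reversed(range(stages)):
        side *= 2
        channels //= 2
        assert skips[k] == (channels, side)  # skip connection must line up
        trace.append((f"dec{k + 1}", channels, side))

    return trace


for name, channels, side in unet_shape_trace():
    print(f"{name:10s} channels={channels:4d} size={side}x{side}")
```

The symmetry behind the “U” shows up as matching channel and size pairs between `enc k` and `dec k`.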

Dual Attention Model

The PAM and CAM are integrated to capture long-range contextual information in the spatial and channel dimensions (Fu et al. 2019). The DAM, which incorporates global interdependencies in the spatial and channel dimensions, could boost the oil identification accuracy compared with the original UNet. Figure 2 presents the details of the DAM.

Position Attention Map

Figure 2a illustrates the PAM architecture in detail. The local input feature $A\in \mathbb{R}^{C\times H\times W}$ was fed into three convolution layers to generate three new feature maps (i.e., B, C, and D), where $\{B, C, D\}\in \mathbb{R}^{C\times H\times W}$. Next, B, C, and D were reshaped to $\mathbb{R}^{C\times N}$, where $N = H\times W$ is the number of pixels. A matrix multiplication was performed between the transpose of C and B. A SoftMax layer was then applied to calculate the spatial attention map $S\in \mathbb{R}^{N\times N}$, as shown in Eq. (1):

$$s_{ji}=\frac{\exp(B_{i}\cdot C_{j})}{\sum_{i=1}^{N}\exp(B_{i}\cdot C_{j})}$$

(1)

where $s_{ji}$ measures the impact of the $i$th position on the $j$th position (Fu et al. 2019).

A matrix multiplication was then performed between D and the transpose of S. The result was reshaped to $\mathbb{R}^{C\times H\times W}$, multiplied by a scale parameter α, and added element-wise to the input feature A to obtain the final output $E\in \mathbb{R}^{C\times H\times W}$, as shown in Eq. (2):

$$E_{PAM}^{j}=\alpha \sum_{i=1}^{N}(s_{ji}D_{i})+A_{j}$$

(2)

where $\alpha$ is initialized as 0 and gradually learns to assign more weight.
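As a rough illustration of the position attention computation, the NumPy sketch below follows the steps in the text: three projections of A, a softmax over positions, and a scaled residual sum over the N positions. It is not the authors' implementation. The three convolution layers are replaced by plain (C, C) matrices `Wb`, `Wc`, and `Wd`, which amounts to assuming 1 × 1 convolutions without bias, and the function name `position_attention` is hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def position_attention(A, Wb, Wc, Wd, alpha):
    """Position attention on a feature map A of shape (C, H, W).

    Wb, Wc, Wd: (C, C) matrices standing in for the three convolution
    layers (assumed here to be 1x1 convolutions without bias).
    Returns the output E of shape (C, H, W) and the attention map S (N, N).
    """
    C, H, W = A.shape
    A_flat = A.reshape(C, H * W)          # (C, N) with N = H * W
    B_map = Wb @ A_flat                   # (C, N)
    C_map = Wc @ A_flat                   # (C, N)
    D_map = Wd @ A_flat                   # (C, N)

    # energy[j, i] = B_i . C_j; the softmax over i gives s_ji.
    energy = C_map.T @ B_map              # (N, N)
    S = softmax(energy, axis=1)

    # E_j = alpha * sum_i s_ji * D_i + A_j
    E = alpha * (D_map @ S.T) + A_flat
    return E.reshape(C, H, W), S


rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3, 5))
Wb, Wc, Wd = (rng.normal(size=(4, 4)) for _ in range(3))
E, S = position_attention(A, Wb, Wc, Wd, alpha=0.0)
print(E.shape, S.shape)  # (4, 3, 5) (15, 15)
```

With `alpha = 0.0` the module returns its input unchanged, which is why starting α at zero lets training begin from the plain UNet features and add spatial context gradually.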

Channel Attention Map

Figure 2b illustrates the CAM architecture in detail. Each channel map of the high-level features can be regarded as a class-specific response, and the different semantic responses are associated with each other. By exploiting the interdependencies between the channel maps, interdependent feature maps can be emphasized, and the representation of specific semantic features can be improved. Therefore, we constructed a channel attention module to explicitly model the interdependencies between channels. The channel attention map $X\in \mathbb{R}^{C\times C}$ was calculated directly from the original feature map A: A was reshaped to $\mathbb{R}^{C\times N}$, and a matrix multiplication was performed between A and its transpose. Finally, a SoftMax layer was applied to obtain the channel attention map $X\in \mathbb{R}^{C\times C}$, as shown in Eq. (3):

$$x_{ji}=\frac{\exp(A_{i}\cdot A_{j})}{\sum_{i=1}^{C}\exp(A_{i}\cdot A_{j})}$$

(3)

where ${x}_{ji}$ measures the $i\mathrm{th}$ channel’s impact on the jth channel. Scale parameter β was initialized as 0 and gradually learned to assign more weight. The final output feature map E was calculated as follows:

$${E}_{CAM}^{j}=\beta \sum_{i=1}^{C}({x}_{ji}{A}_{i})+{A}_{j}$$

(4)

Finally, the outputs of both attention maps were combined using one of the four approaches of addition, max-out, multiplication, and concatenation. In this study, we adopted the addition operation and used

$$E_{DAM}^{j} = E_{PAM}^{j} + E_{CAM}^{j}.$$
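The channel attention branch and the addition-based fusion can be sketched in NumPy as follows. This is a minimal sketch rather than the authors' code: the function names `channel_attention` and `fuse_by_addition` are hypothetical, and β is passed in as a plain number instead of being learned.

```python
import numpy as np


def channel_attention(A, beta):
    """Channel attention on a feature map A of shape (C, H, W).

    Returns the output E of shape (C, H, W) and the attention map X (C, C).
    """
    C, H, W = A.shape
    A_flat = A.reshape(C, H * W)                   # (C, N)

    # energy[j, i] = A_i . A_j; the softmax over i gives x_ji.
    energy = A_flat @ A_flat.T                     # (C, C)
    e = np.exp(energy - energy.max(axis=1, keepdims=True))
    X = e / e.sum(axis=1, keepdims=True)

    # E_j = beta * sum_i x_ji * A_i + A_j
    E = beta * (X @ A_flat) + A_flat
    return E.reshape(C, H, W), X


def fuse_by_addition(E_pam, E_cam):
    # Element-wise sum of the two attention outputs (the option adopted here).
    return E_pam + E_cam


rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3, 5))
E_cam, X = channel_attention(A, beta=0.1)
E_dam = fuse_by_addition(A, E_cam)  # A stands in for a PAM output of equal shape
print(E_dam.shape, X.shape)         # (4, 3, 5) (4, 4)
```

Unlike the position branch, no projection layers are applied before computing the channel map, so the relationships between the original channel responses are preserved.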

Loss Function

In the training phase, the loss function guides the network to learn meaningful predictions that are close to the ground truth in terms of segmentation metrics (Ma et al., 2021). By measuring the dissimilarity between the ground truth and the predicted segmentation, loss functions play a critical role in CNN-based segmentation methods. A weighted binary cross-entropy loss function was adopted during the training of the proposed approach to handle the imbalanced dataset, and it influenced the performance. Generally, an imbalance can occur in two ways: between foreground and background, and between foreground objects. In our case, the water class made up over 70% of the training dataset, whereas oil represented approximately 30%.

Cross-entropy loss (CE) (Yi-de et al., 2004) measures the difference between two probability distributions for a given random variable or set of events. It is often used for classification and segmentation. Binary cross-entropy loss (BCE) is defined as in Eq. (5):

$$L_{BCE} = -\left(y\log\left(\hat{y}\right) + \left(1 - y\right)\log\left(1 - \hat{y}\right)\right)$$

(5)

where y is the ground truth and ŷ is the predicted value.

The weighted binary cross-entropy loss (WCE) (Pihur et al., 2007) is a version of the binary cross-entropy commonly adopted in skewed data situations. The WCE is defined as in Eq. (6):

$$L_{WCE} = -\left(\beta\, y\log\left(\hat{y}\right) + \left(1 - y\right)\log\left(1 - \hat{y}\right)\right)$$

(6)

where the β value is used to tune false negatives and false positives.
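A minimal pure-Python sketch of Eqs. (5) and (6) is given below. The averaging over pixels, the clipping constant `eps`, and the function name `weighted_bce` are assumptions made for illustration; the weight `beta` here is the loss weight of Eq. (6), not the scale parameter of the channel attention map.

```python
import math


def weighted_bce(y_true, y_pred, beta=1.0, eps=1e-7):
    """Mean weighted binary cross-entropy over a flat list of pixels.

    y_true: ground-truth labels (1 = oil, 0 = water).
    y_pred: predicted oil probabilities in [0, 1].
    beta:   weight on the positive (oil) term; beta = 1 gives plain BCE.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(beta * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


labels = [1, 0, 0, 1]
probs = [0.9, 0.2, 0.1, 0.4]
print(weighted_bce(labels, probs))            # plain BCE
print(weighted_bce(labels, probs, beta=2.0))  # missed oil pixels cost more
```

Setting `beta` above 1 penalizes false negatives on the minority oil class more heavily, whereas a value below 1 shifts the penalty toward false positives.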

Experimental Results

Section 4.1 presents a brief description of the datasets. Section 4.2 describes the environmental setup. Section 4.3 illustrates the evaluation metrics. Section 4.4 discusses the experimental results and findings.

Datasets

The oil spill incident considered in this study occurred off the coast of Saudi Arabia, approximately 96 km from the Saudi coast of Jeddah (Fig. 3), between $36^{\circ}03'24''$ E longitude and $21^{\circ}41'17''$ N latitude and $41^{\circ}31'44''$ E longitude and $19^{\circ}08'05''$ N latitude. This oil leakage hampered global shipping and trade in the Red Sea, which is connected to the major shipping routes worldwide.

Sentinel-1 images acquired by the European Space Agency (ESA) between October 13 and 25, 2019, were collected with a 20-m resolution, 5-m azimuth resolution, VV polarization, and Universal Transverse Mercator zone-37 North projection. The imagery dates were selected on the basis of SAR data availability and by considering the limited time lapse after the incident. These Sentinel-1 SAR images were used to evaluate the algorithms, and the specifications of the GRD Sentinel-1A data are presented in Table 2. The case considered herein dealt with C-band radar images with VV polarization, which is considered appropriate for oil spill detection (Cantorna et al., 2019). The preprocessing of Sentinel-1A data (Filipponi, 2019), shown in Fig. 4, involves four phases: (1) extraction of the amplitude VV polarization, (2) application of radiometric correction to the amplitude VV polarization, (3) application of Lee speckle filtering, and (4) as the final step, geometric terrain correction to compensate for geometric distortions. This last step ensures that the geometry of the layer matches or closely resembles the geometry of the physical world (Christiansen et al., 2018; Schubert et al., 2015). For the Range Doppler terrain correction, the DEM parameter was set to "SRTM 3Sec.", and both the DEM resampling and image resampling methods were set to bilinear interpolation (Arif & Akbar, 2005; Madaan & Kaur, 2019). A pixel spacing of 20 m was selected. The SAR data used were compiled into a dataset called the EG-OilSpill dataset. This dataset starts with 440 images of size 256 × 256 and is annotated at the pixel level with two classes (i.e., water and oil). The dataset contains a total of 3000 pairs of images. Statistically, almost 70% of the pixels in the dataset represent water bodies, whereas the remaining 30% represent oil spills.
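To give a feel for the speckle filtering step, the sketch below implements a simplified Lee-type filter in NumPy. It is not the filter configuration used in the study: the 5 × 5 window, the additive-noise form of the weighting, and the default noise variance estimate are all assumptions, and the function name `lee_filter` is hypothetical. It relies on `numpy.lib.stride_tricks.sliding_window_view`, which requires NumPy 1.20 or later.

```python
import numpy as np


def lee_filter(img, window=5, noise_var=None):
    """Simplified Lee-type speckle filter for a 2-D amplitude image.

    window:    odd side length of the local window (an assumed value).
    noise_var: noise variance; if None it is estimated as the mean of the
               local variances (an assumption made for this sketch).
    """
    img = np.asarray(img, dtype=float)
    pad = window // 2
    padded = np.pad(img, pad, mode="reflect")
    # One (window x window) view per pixel: shape (H, W, window, window).
    views = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    local_mean = views.mean(axis=(-2, -1))
    local_var = views.var(axis=(-2, -1))
    if noise_var is None:
        noise_var = local_var.mean()
    # Weight near 0 in flat areas (smooth), near 1 at strong edges (preserve).
    weight = local_var / (local_var + noise_var + 1e-12)
    return local_mean + weight * (img - local_mean)


rng = np.random.default_rng(0)
noisy = rng.gamma(shape=4.0, scale=0.25, size=(64, 64))  # synthetic speckle
smoothed = lee_filter(noisy)
print(noisy.var() > smoothed.var())
```

The adaptive weight is the point of the filter: homogeneous sea surface is averaged, while high-variance regions such as slick boundaries stay closer to the original pixel values.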

Table 2 Specifications of the Sentinel-1A data


Environmental Setup

Table 3 presents the hardware and software configuration used to train the proposed oil spill identification approach, which uses the Adam optimizer (Li et al., 2022). The dataset is split into training, validation, and test sets with an 80/10/10 ratio (Shaban et al., 2021). Early stopping is adopted as a regularization technique to prevent overfitting.
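The split and the early stopping rule can be sketched in a few lines of Python. The random seed, the shuffling strategy, and the patience of 5 epochs are assumptions, since these details are not specified here; `split_indices` and `should_stop` are hypothetical helper names.

```python
import random


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle n sample indices and split them into train/val/test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def should_stop(val_losses, patience=5):
    """True when the best validation loss is at least `patience` epochs old."""
    if len(val_losses) <= patience:
        return False
    best_epoch = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_epoch >= patience


train, val, test = split_indices(3000)
print(len(train), len(val), len(test))  # 2400 300 300
```

For the 3000 image pairs of the dataset, an 80/10/10 split gives 2400 training, 300 validation, and 300 test pairs.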

Table 3 Hardware and software configuration


Evaluation Metrics

For the architecture evaluation, we used various evaluation metrics commonly used in object detection problems as shown in Table 4. The metrics of recall, precision, F1 score (Ozigis et al., 2019), and overall accuracy (OA) are presented in Eqs. (7)–(10), respectively (Mahmoud et al., 2021), where TP is true positive, FN is false negative, FP is false positive, and TN is true negative.
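The four metrics follow their standard definitions in terms of TP, FP, FN, and TN, which the sketch below computes from pixel counts. The function name `segmentation_metrics` and the example counts are hypothetical and are not results from the paper; the sketch assumes every denominator is non-zero.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Recall, precision, F1 score, and overall accuracy from pixel counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    overall_accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "overall_accuracy": overall_accuracy,
    }


# Made-up counts for illustration only.
print(segmentation_metrics(tp=80, fp=20, fn=20, tn=880))
```

With a dominant water class, overall accuracy is driven mostly by TN, so recall, precision, and F1 on the oil class are the more informative numbers.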

Table 4 Evaluation metrics


Results and Findings

We conducted several experiments to evaluate the proposed DAM-UNet for the oil spill detection task. We also compared the performance of the proposed approach with those of the four baseline methods, namely, FCN (Long et al., 2015), traditional UNet (Ronneberger et al., 2015), LinkNet (Chaurasia & Culurciello, 2017), and multiscale pyramid network-based model (PSPNet) (Zhao et al., 2017). Table 5 compares and summarizes the architectures, backbones, and loss functions of each method.

Table 5 Comparisons of the baseline segmentation methods


Table 6 presents the quantitative comparison between the proposed DAM-UNet and the four baseline methods in terms of four metrics (OA, recall, precision, and F1 score) using cross-entropy. The proposed DAM-UNet outperformed the other methods in terms of the OA (93.7%), precision (90.8%) (Shaban et al., 2021; Yekeen & Balogun, 2020), and F1 score (85.9%). PSPNet (92.8%) and FCN (92.3%) showed slightly lower OA scores than DAM-UNet, whereas UNet reached 91%. The F1 score and precision illustrated the same pattern, whereas the highest recall (83.2%) was achieved by PSPNet.

Table 6 Quantitative comparison between the different segmentation methods


Figure 5 presents a comparison of the visual results of the aforementioned models (Alpers et al., 2017). The proposed DAM-UNet model achieved the best identification accuracy for both water bodies and oil spill regions with 93.7% OA. By contrast, the LinkNet model yielded the lowest accuracy with misclassified oil spill regions. The UNet model identified the sharp edges of the oil spill regions, but several pixels were misclassified as water bodies. PSPNet and FCN showed approximately the same performance (92.8% and 92.3% OA, respectively). Figure 6 depicts selected samples of the EG-OilSpill dataset with the results of the proposed DAM-UNet.

We conducted various experiments to evaluate the impact of different loss functions, i.e., WCE, focal loss [52], and BCE [53], on the performance of the proposed DAM-UNet. Table 7 presents the obtained results in terms of the mIoU and OA for each loss function. The WCE achieved the best results in terms of the OA, with an increase of ~0.5%. The focal loss showed the worst performance with 92% OA. In terms of the intersection over union (IoU), the proposed DAM-UNet with the WCE achieved higher scores in both the water body and oil spill classes than with the focal loss or BCE.

Table 7 Quantitative comparison of the different loss methods in terms of the intersection over union (%)


These results show that the DAM-UNet-WCE model attained the highest overall mIoU of 83.85%. For the “oil spill” class, the class of highest interest, it achieved an IoU of 75.70%, which was higher than the maximum value obtained by the DAM-UNet with the focal loss or BCE. The lowest mIoU, 81.55%, was obtained with the DAM-UNet-focal loss method. In summary, using the WCE loss in the proposed DAM-UNet enhances the overall accuracy compared with the traditional binary cross-entropy.
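For reference, per-class IoU and mIoU can be computed from flattened label masks as sketched below. The unweighted mean over the two classes is an assumption about how the mIoU is aggregated, and `class_iou` and `mean_iou` are hypothetical helper names; the toy masks are not data from the paper.

```python
def class_iou(y_true, y_pred, label):
    """Intersection over union for one class on flat label sequences."""
    intersection = sum(
        1 for t, p in zip(y_true, y_pred) if t == label and p == label
    )
    union = sum(1 for t, p in zip(y_true, y_pred) if t == label or p == label)
    return intersection / union if union else float("nan")


def mean_iou(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class IoU values (0 = water, 1 = oil)."""
    ious = [class_iou(y_true, y_pred, label) for label in labels]
    return sum(ious) / len(ious)


# Toy masks for illustration only.
truth = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
print(class_iou(truth, pred, 0), class_iou(truth, pred, 1), mean_iou(truth, pred))
```

Because the mean is unweighted, the minority oil class contributes as much to the mIoU as the dominant water class, which makes it a stricter summary than overall accuracy on imbalanced masks.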

The Benchmark Oil Spill Dataset (Krestenitis et al., 2019a, b) comprises a collection of 1112 satellite SAR images of oil-polluted areas obtained via the ESA database. The oil pollution records covered the period from September 28, 2015, to October 31, 2017. The SAR images were acquired from the Sentinel-1 European Satellite missions. Figure 7 presents data samples. To ensure data validity and inclusion of oil spills in the images, the oil spill events and their geographic coordinates were confirmed by the European Maritime Safety Agency through the CleanSeaNet service.

Table 8 shows the quantitative comparison between the proposed DAM-UNet and the four baseline methods in terms of the four metrics on this dataset. The DAM-UNet surpassed all the other methods in the OA (93.5%), precision (88.0%), and F1 score (83.8%); the UNet method attained 92.8% OA. The FCN, PSPNet, and LinkNet had lower OA scores of 92%, 92.2%, and 92.4%, respectively. PSPNet achieved the highest recall score of 82%. Figure 8 illustrates a comparison of the visual results, in which the proposed DAM-UNet model attained the best identification accuracy for both water bodies and oil spills.

Table 8 Quantitative comparison of the various segmentation methods mentioned in this study


Various experiments were implemented to evaluate the impact of different loss functions (i.e., WCE, focal loss, and BCE) on the performance of the proposed DAM-UNet. Table 9 lists the obtained results in terms of the OA and the intersection over union (%). Overall, all loss functions showed acceptable OA scores. The WCE achieved the best results for the OA metrics, showing an increase of ~ 0.3%. BCE and focal loss shared a similar pattern for all metrics. The BCE loss exhibited the worst performance with 92.7% OA. Using the WCE in the proposed DAM-UNet improved the overall accuracy compared with the conventional binary cross-entropy.

Table 9 Quantitative comparison of different loss methods in terms of the intersection over union (%)


The IoU was computed (Table 9) separately for the water body and oil spill classes for the proposed DAM-UNet using the different loss functions. Compared with the DAM-UNet with focal loss or BCE, the proposed method with the WCE achieved a higher accuracy, with an OA score of 93.80% and an mIoU of 82.85% over both classes. For the oil spill class, the class of highest interest, the achieved IoU of 73.70% was higher than the values obtained by the DAM-UNet with focal loss or BCE. The lowest mIoU, 81%, was obtained by the DAM-UNet with the focal loss function. In summary, using the WCE in the proposed DAM-UNet enhances the overall accuracy compared with the traditional binary cross-entropy.

Conclusion

Oil spills are one of the most severe threats to our marine and coastal environment. Therefore, effective monitoring and early warning are essential in facing the danger and reducing environmental damage. SAR sensors can offer high-resolution images of areas where possible oil spills may be detected, and remote sensing via SAR sensors plays an important role in achieving this goal. To automatically interpret SAR images and distinguish oil spill objects, we introduce herein a semantic segmentation method based on UNet and DAM, called DAM-UNet. The proposed approach takes advantage of two attention models (i.e., the channel attention map and position attention map) to learn the channel interdependencies and spatial interrelations of features and to emphasize the informative and discriminative features of an oil spill. We adopted the weighted cross-entropy as the loss function to address the imbalanced dataset problem. We also introduced a large-scale oil spill dataset, called the EG-OilSpill dataset. The obtained results highlighted, both qualitatively and quantitatively, the effectiveness of the proposed method in detecting oil spills on the EG-OilSpill and Benchmark Oil Spill datasets. In the future, accurate models trained on the generated dataset can be integrated into a broader framework for the identification of oil spills and the decision making required to deal with them. In conclusion, segmentation techniques may be applied to other research areas that utilize remote sensing, such as fire or flood detection.

References

Alacid, B., Mira, D., Gil, P., & Oprea, S.-O. (2017). Candidate oil spill detection in SLAR data: A recurrent neural network-based approach. In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, 372–377.
Alpers, W., Holt, B., & Zeng, K. (2017). Oil spill detection by imaging radars: Challenges and pitfalls. Remote Sensing of Environment, 201, 133–147.
Al-Ruzouq, R., Mohamed Barakat, A., Gibril, A. S., Kais, A., Hamed, O., Al-Mansoori, S., & Khalil, M. A. (2020). Sensors, features, and machine learning for oil spill detection and monitoring: A review. Remote Sensing, 12(20), 3338.
Arif, F. & Akbar, M. (2005). Resampling air borne sensed data using bilinear interpolation algorithm. Paper Presented at the IEEE International Conference on Mechatronics, ICM'05.
Bovenga, F. (2020). special issue “synthetic aperture radar (SAR) techniques and applications.” Multidisciplinary Digital Publishing Institute.
Cantorna, D., Dafonte, C., Iglesias, A., & Arcay, B. (2019). Oil spill segmentation in SAR images using convolutional neural networks: A comparative analysis with clustering and logistic regression algorithms. Applied Soft Computing, 84, 105716.
Chaurasia, A. & Culurciello, E. (2017). Linknet: Exploiting encoder representations for efficient semantic segmentation. Paper Presented at the IEEE Visual Communications and Image Processing (VCIP).
Chen, G., Li, Y., Sun, G. & Zhang, Y. (2017). Polarimetric SAR oil spill detection based on deep networks. Paper presented at the 2017 IEEE International Conference on Imaging Systems and Techniques (IST).
Christiansen, M. P, Laursen, M. S., Mikkelsen, B. F., Teimouri, N., Jørgensen, R. N., Sørensen, C. A. G. (2018). "Current potentials and challenges using Sentinel-1 for broadacre field remote sensing. arXiv:1809.01652.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3(3), 201–215.
Article Google Scholar
Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., & Lu, H., 2019. Dual attention network for scene segmentation. Paper presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Fang, W., Ding, L., Love, P. E. D., Luo, H., Li, H., Pena-Mora, F., Zhong, B., & Zhou, C. (2020). Computer vision applications in construction safety assurance. Automation in Construction, 110, 103013.
Article Google Scholar
Feng, D., Haase-Schütz, C., L., Rosenbaum, Hertlein, H., Glaeser, C., Timm, F., Wiesbeck, W., & Dietmayer, K. (2020). "Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets, methods, and challenges." IEEE Transactions on Intelligent Transportation Systems.
Filipponi, F. (2019). Sentinel-1 GRD preprocessing workflow. Multidisciplinary Digital Publishing Institute Proceedings, 18(1), 11.
Google Scholar
Guo, H., Wei, G., & An, J. (2018). Dark spot detection in SAR images of oil spill using segnet. Applied Sciences, 8(12), 2670.
Article Google Scholar
Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. Paper Presented at the Proceedings of the IEEE Conference On Computer Vision And Pattern Recognition.
Hwang, J. J., Yu, S. X., Shi, J., Collins, M. D., Yang, T. J., Zhang, X., & Chen, L. C. (2019). SegSort: segmentation by discriminative sorting of segments. Paper Presented at the Proceedings of the IEEE International Conference on Computer Vision.
Ieracitano, C., Paviglianiti, A., Campolo, M., Hussain, A., Pasero, E., & Morabito, F. C. (2020). A novel automatic classification system based on hybrid unsupervised and supervised machine learning for electrospun nanofibers. IEEE/CAA Journal of Automatica Sinica, 8(1), 64–76.
Article Google Scholar
Kolokoussis, P., & Karathanassi, V. (2018). Oil spill detection and mapping using sentinel 2 imagery. Journal of Marine Science and Engineering, 6(1), 4.
Article Google Scholar
Krestenitis, M., Orfanidis, G., Ioannidis, K., Avgerinakis, K., Vrochidis, S., & Kompatsiari, I. 2019a. Early Identification of Oil Spills in Satellite Images Using Deep CNNs. Paper presented at the International Conference on Multimedia Modeling.
Krestenitis, M., Orfanidis, G., Ioannidis, K., Avgerinakis, K., Vrochidis, S., & Kompatsiaris, I. (2019b). Oil spill identification from satellite images using deep neural networks. Remote Sensing, 11(15), 1762.
Article Google Scholar
Li, X., Liu, X., Xiao, Y., Zhang, Y., Yang, X., & Zhang, W. (2022). An Improved U-Net Segmentation Model That Integrates a Dual Attention Mechanism and a Residual Network for Transformer Oil Leakage Detection. Energies, 15(12), 4238.
Article Google Scholar
Liu, L., Li, H., & Gruteser. M., (2019). Edge assisted real-time object detection for mobile augmented reality. Paper presented at The 25th Annual International Conference on Mobile Computing and Networking.
Liu, W., Wang, Z., Liu, X., Zeng, N., & Bell, D. (2018). A novel particle swarm optimization approach for patient clustering from emergency departments. IEEE Transactions on Evolutionary Computation, 23(4), 632–644.
Article Google Scholar
Long, J., Shelhamer, E. & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. Paper presented at the Proceedings of the IEEE conference on computer vision and pattern recognition.
Lou, A., Guan, S., & Loew, M., (2021). DC-UNet: rethinking the U-Net architecture with dual channel efficient CNN for medical image segmentation. Paper Presented at the Medical Imaging 2021: Image Processing.
Ma, J., Chen, J., Ng, M., Huang, R., Li, Y., Li, C., Yang, X. and Martel, A.L. (2021). "Loss odyssey in medical image segmentation." Medical Image Analysis:102035.
Madaan, E. S., & Kaur, S. (2019). Pre-processing of synthetic aperture radar sentinel-1 images for agricultural land. International Journal of Control and Automation, 12(5), 443–459.
Google Scholar
Mahmoud, A., Mohamed, S., El-Khoribi, R., & Abdelsalam, H. (2020). Object detection using adaptive mask RCNN in optical remote sensing images. International Journal of Intelligent Engineering System, 13(1), 65–76.
Article Google Scholar
Mahmoud, A. S., Mohamed, S. A., Moustafa, M. S., El-Khorib, R. A., Abdelsalam, H. M., & El-Khodary, I. A. (2021). Training compact change detection network for remote sensing imagery. IEEE Access, 9, 90366–90378.
Article Google Scholar
Mei, Y., Fan, Y., Zhang, Y., Yu, J., Zhou, Y., Liu, D., Fu, Y., Huang, T.S. and Shi, H., (2020). "Pyramid attention networks for image restoration." arXiv:2004.13824.
Migliaccio, M., Gambardella, A., & Tranfaglia, M. (2007). SAR polarimetry to observe oil spills. IEEE Transactions on Geoscience and Remote Sensing, 45(2), 506–511.
Article Google Scholar
Ming, Y., Meng, X., Fan, C., & Hui, Yu. (2021). Deep learning for monocular depth estimation: A review. Neurocomputing, 438, 14–33.
Article Google Scholar
Moustafa, M. S., Ahmed, S., & Hamed, A. A. (2020). Learning to hash with convolutional network for multi-label remote sensing image retrieval. International Journal of Intelligent Engineering System, 13(5), 539–548.
Article Google Scholar
Moustafa, M. S., & Sayed, S. A. (2021). Satellite imagery super-resolution using squeeze-and-excitation-based GAN. International Journal of Aeronautical and Space Sciences, 22(6), 1481–1492.
Article Google Scholar
Ozigis, M. S., Kaduk, J. D., & Jarvis, C. H. (2019). Mapping terrestrial oil spill impact using machine learning random forest and Landsat 8 OLI imagery: A case site within the Niger Delta region of Nigeria. Environmental Science and Pollution Research, 26(4), 3621–3635.
Article Google Scholar
Pihur, V., Datta, S., & Datta, S. (2007). Weighted rank aggregation of cluster validation measures: A monte carlo cross-entropy approach. Bioinformatics, 23(13), 1607–1615.
Article Google Scholar
Ronneberger, O., Fischer, P., & Brox, T., (2015). U-net: Convolutional networks for biomedical image segmentation. Paper Presented at the International Conference on Medical Image Computing And Computer-Assisted Intervention.
Rousso, R., Katz, N., Sharon, G., Glizerin, Y., Kosman, E., & Shuster, A. (2022). Automatic recognition of oil spills using neural networks and classic image processing. Water, 14(7), 1127.
Article Google Scholar
Schubert, A., Small, D., Miranda, N., Geudtner, D., & Meier, E. (2015). Sentinel-1A product geolocation accuracy: Commissioning phase results. Remote Sensing, 7(7), 9431–9449.
Article Google Scholar
Shaban, M., Salim, R., Khalifeh, H. A., Khelifi, A., Shalaby, A., El-Mashad, S., Mahmoud, A., Ghazal, M., & El-Baz, A. (2021). A deep-learning framework for the detection of oil spills from SAR data. Sensors, 21(7), 2351.
Article Google Scholar
Sharma, V., & Mir, R. N. (2020). A comprehensive and systematic look up into deep learning based object detection techniques: A review. Computer Science Review, 38, 100301.
Article Google Scholar
Singh, H., & Singh, J. (2018). Real-time eye blink and wink detection for object selection in HCI systems. Journal on Multimodal User Interfaces, 12(1), 55–65.
Article Google Scholar
Song, D., Ding, Y., Li, X., Zhang, B., & Mingyu, Xu. (2017). Ocean oil spill classification with RADARSAT-2 SAR based on an optimized wavelet neural network. Remote Sensing, 9(8), 799.
Article Google Scholar
Song, D., Zhen, Z., Wang, B., Li, X., Gao, Le., Wang, N., Xie, T., & Zhang, T. (2020). A novel marine oil spillage identification scheme based on convolution neural network feature extraction from fully polarimetric SAR imagery. IEEE Access, 8, 59801–59820.
Article Google Scholar
Xiong, Y., & Zhou, H. (2019). Oil spills identification in SAR image based on convolutional neural network. Paper Presented at the 2019 14th International Conference on Computer Science & Education (ICCSE).
Yan, Z., Chong, J., Zhao, Y., Sun, K., Wang, Y., & Li, Y. (2019). "Multifeature fusion neural network for oceanic phenomena detection in SAR images. Sensors (Basel), 20(1), 210. https://doi.org/10.3390/s20010210
Article Google Scholar
Yekeen, S. T., & Balogun, A. L. (2020). Automated marine oil spill detection using deep learning instance segmentation model. International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences 43.
Yi-de, M., Qing, L., & Zhi-Bai, Q. (2004). Automated image segmentation using improved PCNN model based on cross-entropy. Paper presented at the Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004.
Yu, H., Garrod, O., Jack, R., & Schyns, P. (2015). A framework for automatic and perceptually valid facial expression generation. Multimedia Tools and Applications, 74(21), 9427–9447.
Article Google Scholar
Zeng, K., & Wang, Y. (2020). A deep convolutional neural network for oil spill detection from spaceborne SAR images. Remote Sensing, 12(6), 1015.
Article Google Scholar
Zhang, K., Zhong, G., Dong, J., Wang, S., & Wang, Y. (2019). Stock market prediction based on generative adversarial network. Procedia Computer Science, 147, 400–406.
Article Google Scholar
Zhang, X., Jin, J., Lan, Z., Li, C., Fan, M., Wang, Y., Xin, Yu., & Zhang, Y. (2020). ICENET: A semantic segmentation deep network for river ice by fusing positional and channel-wise attentive features. Remote Sensing, 12(2), 221.
Article Google Scholar
Zhang, Y., Yu Li, X., Liang, S., & Tsou, J. (2017). Comparison of oil spill classifications using fully and compact polarimetric SAR images. Applied Sciences, 7(2), 193.
Article Google Scholar
Zhao, H., Shi, J., Qi, X., Wang, X. and Jia, J. (2017). Pyramid scene parsing network. Paper presented at the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhu, L., Zhan, S., & Zhang, H. (2019). Stacked U-shape networks with channel-wise attention for image super-resolution. Neurocomputing, 345, 58–66.
Article Google Scholar

Download references

Author information

Authors and Affiliations

National Authority for Remote Sensing and Space Science, Cairo, Egypt
Amira S. Mahmoud & Sayed A. Mohamed
Faculty of Computers and Artificial Intelligence, Cairo University, Giza, Egypt
Reda A. El-Khoriby, Hisham M. AbdelSalam & Ihab A. El-Khodary

Corresponding author

Correspondence to Amira S. Mahmoud.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Mahmoud, A.S., Mohamed, S.A., El-Khoriby, R.A. et al. Oil Spill Identification based on Dual Attention UNet Model Using Synthetic Aperture Radar Images. J Indian Soc Remote Sens 51, 121–133 (2023). https://doi.org/10.1007/s12524-022-01624-6

Received: 27 April 2022
Accepted: 21 October 2022
Published: 20 November 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s12524-022-01624-6
