1 Introduction

Mesoscale eddies are rotating bodies of water with spatial scales from tens to hundreds of kilometers and temporal scales from days to years [7]. They play a significant role in transporting momentum, mass, heat, nutrients, salt, and other seawater chemical elements across the ocean basins, and thereby affect the global ocean circulation, large-scale water distribution, air-sea coupling, and biological activity [2, 3, 7, 8, 13, 18]. Mesoscale eddies are generally classified as cyclonic eddies (CEs) if they rotate counterclockwise in the Northern Hemisphere, and as anticyclonic eddies (AEs) otherwise. CEs (AEs) drive local upwelling (downwelling), leading to negative (positive) sea surface height (SSH) anomalies and sea surface temperature (SST) anomalies. The changes in SSH, SST, chlorophyll concentration (CHL), and surface roughness caused by oceanic eddies can be recorded by altimeter, infrared, ocean color, and synthetic aperture radar (SAR) remote sensing, respectively. Accurate automatic eddy detection is crucial for monitoring the physical properties, transport, circulation, evolution, and decay of mesoscale eddies and their impact on other ocean processes. Oceanic eddy detection based on a variety of remote sensing data has therefore been widely studied.

Automatic eddy identification algorithms developed for altimeter SSH data can be divided into three categories: physical-parameter-based methods, which include the Okubo-Weiss parameter method [6, 27], the winding angle method [5, 46], and the 2D wavelet method [10]; flow-direction-based methods [39, 48]; and SSH-based methods [7, 18, 38]. A more recent approach based on the instantaneous Lagrangian flow geometry [1, 22, 23, 24, 25, 40] has been proposed to identify eddies in turbulent flows. Several eddy detection methods have also been developed for satellite SST data, e.g., the edge detection method [29], the neural network-based method [4], the SST contour-based method [17], and the velocity-geometry method [12]. Compared to satellite SSH data, CHL and SAR images have a high spatial resolution, which makes them effective sources of more comprehensive and detailed information on mesoscale eddies in the oceans [7, 14, 15, 36]. However, eddy detection based on CHL and SAR images is still at the case-study stage because of their low space-time coverage. In summary, existing eddy detection algorithms can detect the major circular structures of mesoscale eddies, but more work remains to be done. On the one hand, each type of remote sensing data has its own advantages and disadvantages for eddy detection. For instance, eddies may temporarily ‘disappear’ or go undetected because of noise and sampling errors in the altimeter SSH data, while eddy detection using SST is prone to false positives because many other ocean phenomena affect SST. On the other hand, as remote sensing data accumulate, some algorithms lack computational efficiency because of contour iterations [49] or complex calculation processes [40].

Recently, deep learning (DL) [33] technology has exhibited state-of-the-art performance in mining the complicated rules hidden in multi-source ocean remote sensing images [26, 35, 52]. Moreover, in comparison with traditional statistical and machine learning methods, DL technology has a strong ability to learn and model complex relationships [28, 30, 43, 47, 51]. It is therefore natural to use DL-based models to detect mesoscale eddies from remote sensing images. Lguensat et al. [34] developed "EddyNet", which is based on the encoder-decoder network U-Net, to identify oceanic eddies in the southwest Atlantic. Franz et al. [20] also used the U-Net to detect and track oceanic eddies in the Australia and East Australia Current regions. Du et al. [15] developed "DeepEddy" based on PCANet and spatial pyramid pooling to detect oceanic eddies in SAR images. Xu et al. [50] applied the pyramid scene parsing network to detect eddies in the North Pacific Subtropical Countercurrent region. These regional studies showed that DL-based models perform well in detecting mesoscale eddies in regional seas. However, the performance of DL-based models in global mesoscale eddy detection remains unverified. Moreover, these works use only one type of remote sensing data as input to detect mesoscale eddies.

To solve the above problems, we propose a DL-based global eddy detection model based on the fusion of SSH and SST data. The remainder of the study is organized as follows. Section 2 first presents a DL-based model that identifies global mesoscale eddies from satellite SSH data. Sect. 3 then presents a multi-modal DL-based eddy detection model developed from the fusion of SST and SSH data. Section 4 characterizes the global mesoscale eddies detected by the multi-modal DL-based model. Finally, Sect. 5 summarizes the conclusions of our investigation.

2 DL-based Eddy Detection Model Based on SSHA Data

2.1 Data

The sea surface height anomaly (SSHA) product is produced by Ssalto/Duacs and distributed by the Archiving, Validation, and Interpretation of Satellite Oceanographic data (AVISO) service, and is available daily at \(0.25\,^{\circ }\) spatial resolution. The product is merged from all available altimeter missions, including TOPEX/Poseidon (TP), Jason-1 & 2, European Remote-Sensing Satellite (ERS)-1 & 2, Environmental Satellite (ENVISAT), Geosat Follow On (GFO), Cryosat-2, Saral/AltiKa, and Haiyang-2A, and covers the period from 1993 to the present. Since resolving oceanic mesoscale variability requires a minimum of three altimeter missions [32, 41, 42], only the period from 2000 onward meets the criterion.

2.2 Method

The DL-based eddy detection model is built on the U-Net architecture with ResNet blocks, hereafter Res-UNet. Although developed initially for semantic segmentation of biomedical images [38], U-Net [19, 45] has been applied successfully in many fields. Fig. 1 shows the framework of the U-Net, which consists of the encoder-decoder module, bottleneck module, and concatenation module. The encoder module extracts information at different resolutions. The output module contains a convolutional layer and an activation layer to yield class confidences at each pixel.

Fig. 1 Res-UNet based eddy detection model

The ResNet block is designed to deepen the network while alleviating the problem of network degradation. The input to the ResNet block, \(x_{r}\), is processed in two ways. A 3\(\times \)3 convolution is used to obtain a direct linear mapping result; i.e., \(w_{r}*x_{r}\), where \(w_{r}\) denotes the convolutional filter. Meanwhile, \(x_{r}\) is subjected to the following processes twice in sequence: batch normalization (BN), a rectified linear unit (ReLU), and Conv2D. The ReLU layer is used to increase the nonlinearity. By adding the direct and residual mapping, the ResNet block combines deep-learning and shallow-learning features, meaning that it can extract more valuable information. The original information is maintained and passed by the process of linear mapping with a 3\(\times \)3 convolution, which reduces the possibility of degradation.
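The two paths of the block described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the chapter's Keras implementation: the helper names (`conv3x3`, `batch_norm`, `res_block`) are hypothetical, batch normalization is reduced to a per-channel standardization without learned scale and shift, and the weights are supplied by the caller.

```python
import numpy as np

def conv3x3(x, w):
    """Same-padded 3x3 convolution. x: (H, W, Cin), w: (3, 3, Cin, Cout)."""
    h, wd, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            # Each kernel tap contributes a shifted copy of the input.
            out += xp[i:i + h, j:j + wd, :] @ w[i, j]
    return out

def batch_norm(x, eps=1e-5):
    """Per-channel standardization (assumption: no learned scale/shift)."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0.0)

def res_block(x, w_direct, w1, w2):
    """Direct 3x3 mapping plus a residual path of two BN -> ReLU -> Conv stages."""
    direct = conv3x3(x, w_direct)             # w_r * x_r
    r = conv3x3(relu(batch_norm(x)), w1)      # first BN -> ReLU -> Conv2D
    r = conv3x3(relu(batch_norm(r)), w2)      # second BN -> ReLU -> Conv2D
    return direct + r                         # add direct and residual mapping
```

With zero residual weights the block reduces to the direct mapping alone, which illustrates why the original information survives even if the residual path learns nothing useful.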

2.3 Experiment and Performance

The training and validation datasets of mesoscale eddies are generated automatically by using the SSH-based method [37], which is similar to the eddy identification method proposed by Chelton et al. [7]. Mesoscale eddies from 2000-2013 and 2014-2015 are used as the training dataset and validation dataset, respectively. There are 5114 training samples and 730 testing samples. Pixels in each sample are labeled as ‘1’, ‘-1’, and ‘0’ inside anticyclonic eddies (AEs), cyclonic eddies (CEs), and background regions, respectively. The Res-UNet model is trained on an Nvidia GeForce RTX 2070 GPU card using the ADAM optimizer [31] and mini-batches of 16 maps. An early-stopping strategy stops the learning process when the validation loss has not improved for five consecutive epochs. The model is implemented in Python using the Keras framework [9] with a TensorFlow backend. The dice loss function, which is widely used in segmentation problems, is the cost function. Given the predicted segmentation P and the ground truth region G, the dice coefficient is calculated as:

$$\begin{aligned} Dicecoef(P,\ G) = \frac{2|P \cap G|}{|P| + |G|} \end{aligned}$$
(1)

where \(|\cdot |\) is the sum of elements in the area. A dice coefficient close to 1 indicates a good segmentation result, whereas a low dice coefficient (near 0) indicates poor segmentation performance. A differentiable version of the above metric must be used to train deep neural networks. A soft Dice coefficient was adopted in this work, and the output of the softmax layer was used directly in the loss calculation. The coefficient is given as:

$$\begin{aligned} softDicecoef(P,\ G) = \frac{2\sum _{i} p_{i} \times g_{i}}{\sum _{i} p_{i} + \sum _{i} g_{i}} \end{aligned}$$
(2)

where \(p_{i}\) is the output of the softmax layer, and \(g_{i}\) is 1 for the correct class and 0 otherwise. Finally, the loss is calculated as:

$$\begin{aligned} Loss = 1 - softDicecoef(P,\ G) \end{aligned}$$
(3)
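The three definitions above can be sketched in NumPy as follows. This is a minimal illustration rather than the chapter's Keras loss: the function names are hypothetical, the soft coefficient carries the factor of 2 from Eq. (1) so that a perfect prediction scores 1, and the small `eps` term is an added assumption that avoids division by zero on empty maps.

```python
import numpy as np

def dice_coef(p_mask, g_mask):
    """Eq. (1): hard Dice coefficient between two binary masks."""
    p_mask = np.asarray(p_mask, dtype=bool)
    g_mask = np.asarray(g_mask, dtype=bool)
    denom = p_mask.sum() + g_mask.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return 2.0 * np.logical_and(p_mask, g_mask).sum() / denom

def soft_dice_coef(probs, onehot, eps=1e-7):
    """Soft Dice on softmax probabilities p_i and one-hot truth g_i (same shape)."""
    inter = np.sum(probs * onehot)
    return (2.0 * inter + eps) / (np.sum(probs) + np.sum(onehot) + eps)

def dice_loss(probs, onehot):
    """Eq. (3): loss = 1 - soft Dice coefficient."""
    return 1.0 - soft_dice_coef(probs, onehot)

# Two pixels, three classes; the prediction is confident and correct.
truth = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pred = np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])
print(dice_loss(pred, truth))
```

Because the soft version works on probabilities rather than thresholded masks, it is differentiable with respect to the network output, which is the property the text requires for training.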

The loss and accuracy of the Res-UNet model were about 14% and 94%, respectively, when trained on the ground truth dataset in the South China Sea (SCS) (Fig. 2).
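The chapter does not spell out how the accuracy, or the mean IoU quoted later for the global run, is computed. A common reading, assumed here, is pixel-wise accuracy and class-averaged intersection over union on the three-class label maps (-1 for CEs, 0 for background, 1 for AEs). The function names are hypothetical.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference label."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    return float(np.mean(pred == truth))

def mean_iou(pred, truth, classes=(-1, 0, 1)):
    """Intersection over union averaged over the classes present in either map."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    ious = []
    for c in classes:
        p, t = pred == c, truth == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # assumption: a class absent from both maps is skipped
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

truth = np.array([[1, 1, 0], [0, -1, -1]])
pred = np.array([[1, 0, 0], [0, -1, -1]])
print(pixel_accuracy(pred, truth), mean_iou(pred, truth))
```

Mean IoU penalizes a missed eddy more heavily than pixel accuracy does, because the background class dominates the pixel count in most SSHA maps.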

Fig. 2 Mesoscale eddies detected by the SSH-based method and Res-UNet in the SCS on January 1, 2019

Therefore, the Res-UNet model is accurate and reliable enough to obtain mesoscale eddies in the global ocean. The global SSH and SST maps were first partitioned into regional maps of 80\(\times \)60 pixels. Then, the Res-UNet model was applied to the SSHA maps for the same date, region by region, until all the regions had been processed. Finally, the eddies of all the regions were seamlessly merged to obtain a global eddy map. Figure 3 shows the mesoscale eddies identified by the Res-UNet model on January 1, 2019. There are 3314 (2963 ground truth) AEs and 3407 (3056 ground truth) CEs in the global ocean. Compared to the SSH-based method, the accuracy of the Res-UNet based global eddy detection method is 93.79%, and the mean IoU is 88.86%. Figure 3 clearly shows that the Res-UNet model identified many more small-scale eddies. Besides, the Res-UNet model takes less than 1 minute to identify eddies in the global ocean, while the SSH-based method takes more than 16 hours [37]. In conclusion, the Res-UNet model can identify many more small-scale eddies and significantly improve computational efficiency.
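The partition, detect, and merge procedure can be sketched as below. Several details are assumptions: the global grid is taken as 720 by 1440 cells (0.25 degree), each tile is taken as 60 rows in latitude by 80 columns in longitude, tiles do not overlap, and `predict_tile` is a hypothetical stand-in for the trained Res-UNet.

```python
import numpy as np

def split_tiles(global_map, tile_h=60, tile_w=80):
    """Cut a 2-D map into non-overlapping tiles keyed by their top-left index."""
    h, w = global_map.shape
    if h % tile_h or w % tile_w:
        raise ValueError("map size must be a multiple of the tile size")
    tiles = {}
    for r in range(0, h, tile_h):
        for c in range(0, w, tile_w):
            tiles[(r, c)] = global_map[r:r + tile_h, c:c + tile_w]
    return tiles

def merge_tiles(tiles, shape, dtype=np.int8):
    """Write each labeled tile back to its position in the global label map."""
    out = np.zeros(shape, dtype=dtype)
    for (r, c), tile in tiles.items():
        out[r:r + tile.shape[0], c:c + tile.shape[1]] = tile
    return out

def detect_global(ssha, predict_tile):
    """Apply a per-tile detector to every region and stitch the results."""
    tiles = split_tiles(ssha)
    labels = {key: predict_tile(tile) for key, tile in tiles.items()}
    return merge_tiles(labels, ssha.shape)

# Placeholder detector (NOT the Res-UNet): the sign of the anomaly as a label.
ssha = np.random.default_rng(0).normal(size=(720, 1440))
labels = detect_global(ssha, lambda t: np.sign(t).astype(np.int8))
print(labels.shape)
```

Plain stitching like this does not treat eddies that straddle a tile edge; the text says only that the regions were seamlessly merged, so overlapping tiles or a boundary post-processing step may have been used in practice.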

Fig. 3 Mesoscale eddies detected by the Res-UNet model in the global ocean on January 1, 2019

Fig. 4 An Argo float (red line) is captured by an AE (blue line) that was detected by the Dense-UNet model in the Kuroshio Extension (KE) region and rotated with the AE on a May 19, 2014, b June 18, 2014, c July 18, 2014, and d September 16, 2014 (the color denotes SSHA)

Argo floats have short repeat cycles and can observe mesoscale eddies in the global ocean. When trapped in an eddy, they show either a cyclonic or an anticyclonic trajectory. Therefore, the trajectory data of Argo floats are utilized to verify the accuracy of the Res-UNet model. In this chapter, the Argo float (2901556) is used to validate the results of the Res-UNet based eddy detection model. The Argo float was trapped in the AE and moved in a clockwise loop. This result is consistent with the fact that AEs rotate clockwise in the Northern Hemisphere (Fig. 4).
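One simple way to check the rotation sense of a float loop, offered here as an illustration rather than as the procedure used in the chapter, is the sign of the polygon area given by the shoelace formula. The sketch assumes a single closed loop, positions that do not cross the dateline, and a loop small enough that longitude and latitude can be treated as planar coordinates; `rotation_sense` is a hypothetical name.

```python
import numpy as np

def rotation_sense(lon, lat):
    """Classify a closed trajectory by the sign of its shoelace area."""
    x = np.asarray(lon, dtype=float)
    y = np.asarray(lat, dtype=float)
    # Twice the signed area; np.roll closes the polygon back to the first fix.
    area2 = np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)
    if area2 > 0:
        return "counterclockwise"
    if area2 < 0:
        return "clockwise"
    return "undetermined"

# Made-up positions tracing east -> south -> west -> north, i.e. clockwise.
lon = [145.0, 146.0, 146.0, 145.0]
lat = [35.0, 35.0, 34.0, 34.0]
print(rotation_sense(lon, lat))
```

In the Northern Hemisphere a clockwise loop is what a float trapped in an AE is expected to trace; a real trajectory also drifts with the eddy, so the test is best applied to one loop at a time.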

3 DL-based Eddy Detection Model Based on SSHA and SST Data

To address the problem that eddies may temporarily ‘disappear’ or go undetected because of noise and sampling errors in the SSHA data, SST data, which can finely delineate the eddy structure, are added to the model to detect mesoscale eddies more accurately.

3.1 Data

The SST dataset is the NOAA Optimum Interpolation (OI) SST product from Reynolds et al. [44] at daily, \(0.25\,^{\circ }\) resolution. The OISST dataset is constructed from infrared satellite observations of the Advanced Very High Resolution Radiometer (AVHRR), with supplemental information provided by in situ observations and proxy SSTs computed from sea ice concentrations. Error fields are provided, showing an accuracy of about \(0.1\,^{\circ }\)C on a daily basis. The OISST dataset is available from 1981 onward.

3.2 Method

The Dense-UNet model comprises a data fusion module and a feature extraction module (Fig. 5). Considering the complex nonlinear relationship between SST and SSHA within eddies, a layer-level fusion strategy is used to fuse the SSHA and SST data before feature extraction. A layer-level fusion network can effectively integrate and fully leverage multi-modal images. Therefore, the data fusion module was developed based on the hyper-dense connectivity network [11]. Satellite SST and SSHA data are fed into two separate streams. To better model the relationships between SST and SSHA, dense connections, which use linear operations in which every input is connected to every output by a weight, were introduced into the model. Dense connections can relieve the vanishing gradient problem and reduce the number of parameters of deep networks [11]. Let \(x_{l}^{1}\) and \(x_{l}^{2}\) denote the outputs of the \(l^{\textrm{th}}\) layer in the SST and SSHA streams, and let \(H_{l}\) be a mapping function composed of a convolution layer followed by batch normalization and a ReLU activation function. The output of the \(l^{\textrm{th}}\) layer in a given stream s can then be defined as:

$$\begin{aligned} x_{l}^{s} = H_{l}^{s}\left( \left[x_{l - 1}^{1},\ x_{l - 1}^{2},x_{l - 2}^{1},x_{l - 2}^{2},\cdots ,x_{0}^{1},x_{0}^{2} \right]\right) \end{aligned}$$
(4)

Then, the fused data \(x_{l}^{s}\) are used as the input of the U-Net to detect mesoscale eddies.
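The connectivity pattern of Eq. (4) can be illustrated with a small NumPy sketch. It is not the Dense-UNet itself: \(H_{l}\) is replaced by a 1\(\times \)1 convolution (a per-pixel linear map) followed by ReLU, batch normalization is omitted, the weights are random, and the names `h_layer`, `hyper_dense`, and `growth` are hypothetical.

```python
import numpy as np

def h_layer(x, w):
    """Stand-in for H_l: 1x1 convolution (per-pixel linear map) followed by ReLU."""
    return np.maximum(x @ w, 0.0)

def hyper_dense(sst, ssha, n_layers=3, growth=4, seed=0):
    """Two streams whose layers each see all earlier outputs of both streams."""
    rng = np.random.default_rng(seed)
    outs1, outs2 = [sst], [ssha]          # x_0^1 (SST) and x_0^2 (SSHA)
    in_channels = []
    for _ in range(n_layers):
        # Eq. (4): concatenate every earlier output of both streams,
        # most recent layer first.
        feats = []
        for a, b in zip(reversed(outs1), reversed(outs2)):
            feats += [a, b]
        z = np.concatenate(feats, axis=-1)
        in_channels.append(z.shape[-1])
        w1 = rng.normal(scale=0.1, size=(z.shape[-1], growth))
        w2 = rng.normal(scale=0.1, size=(z.shape[-1], growth))
        outs1.append(h_layer(z, w1))      # x_l^1
        outs2.append(h_layer(z, w2))      # x_l^2
    return outs1, outs2, in_channels

sst = np.random.default_rng(1).normal(size=(60, 80, 1))
ssha = np.random.default_rng(2).normal(size=(60, 80, 1))
_, _, widths = hyper_dense(sst, ssha)
print(widths)  # input width grows as earlier outputs accumulate
```

Each layer's input widens by twice the growth rate per layer, which is how information from both modalities reaches every depth of both streams.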

Fig. 5 Dense-UNet architecture based on the fusion of SSHA and SST data

3.3 Experiment and Performance

The training and validation datasets of mesoscale eddies are generated automatically using the SSH-based method [22]. Mesoscale eddies during 2000-2013 are used as the training dataset, and mesoscale eddies during 2014-2015 are used as the validation dataset. There are 5114 training samples and 730 testing samples. Pixels in each sample are labeled as ‘1’, ‘-1’, and ‘0’ inside AEs, CEs, and background regions, respectively. To evaluate the performance of the Dense-UNet model, we identify mesoscale eddies in the Kuroshio Extension (KE) and the SCS. The dice loss function is used as the cost function. As shown in Table 1, the loss based on SSHA alone is larger than that based on the fusion of SSHA and SST. Conversely, the accuracy based on SSHA alone is lower than that based on the fusion of SSHA and SST.

Table 1 The loss and accuracy of the Dense-UNet model of different testing datasets in different ocean regions

The Dense-UNet model can be further verified by a case study of a CE in the KE (Fig. 6). On November 22 and 23, 2013, the CE identified by SSHA split into two CEs, while the CE identified by the fusion of SST and SSHA was consistent with the negative area of SSHA. From December 28, 2013 to January 13, 2014, the CE identified by SSHA did not cover the negative area of SSHA, while the eddy boundary identified by the fusion of SST and SSHA completely covered the negative area of SSHA. This indicates that the fusion of SSHA and SST data enhances the accuracy and robustness of eddy detection and can also ensure the continuity and accuracy of eddy tracking.

Fig. 6 Variations of a CE in the KE at different times during its evolution. The black line represents the eddy identified by SSHA, while the purple line represents the eddy identified by the fusion of SST and SSHA

In this section, we propose the Dense-UNet method to identify oceanic mesoscale eddies. In contrast to methods that identify eddies from a single kind of remote sensing image, Dense-UNet detects eddies based on the fusion of SSHA and SST data. Using the Dense-UNet model, we perform a comparison experiment with SSHA data and fusion data in the SCS and KE regions. The Dense-UNet model achieves its best detection performance with the fusion data. The model not only improves eddy detection accuracy and efficiency but also offers a novel viewpoint for exploring the relationships between marine environmental variables and mesoscale eddies.

4 Characterization Analysis of Mesoscale Eddies in the Global Ocean

4.1 Spatiotemporal Distributions of Eddies in the Global Ocean

Using the Dense-UNet model, mesoscale eddies were identified from 23 years of satellite SSHA and SST data in the global ocean during 1993–2015. This study focuses on eddies with amplitudes greater than 2 cm and sea surface radii larger than 35 km, in consideration of the resolution and precision of the SSH product [16]. Besides, we only consider eddies located where the water depth is greater than 200 m, to minimize the impact of data errors in shallow coastal waters. An average of 4,100 mesoscale eddies were identified daily in the global ocean during the period 1993–2015. The frequency of eddies in each grid cell (\(0.25\,^{\circ }\) latitude by \(0.25\,^{\circ }\) longitude) was defined as an F-number for simplicity:

$$\begin{aligned} F(\%) = \frac{d_{eddy}}{d_{total}} \times 100 \end{aligned}$$
(5)

where \({d}_{eddy}\) is the number of days on which mesoscale eddies appeared in the grid cell, and \({d}_{total}\) is the total number of observation days. High F-numbers thus imply a high intensity of eddy activity, and vice versa. The seasonal variability of AEs and CEs in the global ocean is similar (Fig. 7a-b). In the Southern Hemisphere (SH), eddy activity is weak in the austral summer (December–February, DJF) and fall (March–May, MAM), but intensive during the austral spring (September–November, SON) and winter (June–August, JJA), and vice versa in the Northern Hemisphere (NH).
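Computed per grid cell, the F-number is a count of eddy days divided by the number of observation days. The sketch below assumes a boolean stack of daily eddy masks with shape (days, lat, lon) and expresses the result in percent, as the F(%) notation suggests; `f_number` is a hypothetical name.

```python
import numpy as np

def f_number(eddy_mask):
    """Percentage of days on which each grid cell lies inside an eddy.

    eddy_mask: boolean array of shape (days, lat, lon).
    """
    eddy_mask = np.asarray(eddy_mask, dtype=bool)
    d_total = eddy_mask.shape[0]       # total number of observation days
    d_eddy = eddy_mask.sum(axis=0)     # eddy days per grid cell
    return 100.0 * d_eddy / d_total

# Four days, one latitude row, two longitude cells.
mask = np.array([[[True, True]],
                 [[False, True]],
                 [[False, True]],
                 [[False, True]]])
print(f_number(mask))
```

Passing only the days of one season, or only the AE or CE masks, gives the seasonal and polarity-specific F-numbers discussed in the text.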

Figure 7c-d shows the spatial distribution of mesoscale eddies in the global ocean. Mesoscale eddies with lower F-numbers were distributed in tropical waters. In contrast, mesoscale eddies with higher F-numbers were widely distributed in the mid-latitude regions, including the Kuroshio Extension region, the Agulhas Current, the Gulf Stream, the Agulhas Return Current, the East Australia Current, and the Antarctic Circumpolar Current. Besides, CE activity is more intensive than AE activity in the Western Boundary Current regions. In general, the spatial distribution of global eddies detected in this study is in good agreement with previous literature [7, 18, 49].

Fig. 7 Spatiotemporal distribution of the F-number of mesoscale eddies in the global ocean from 1993 to 2015. The graphs and maps show the meridional variation and spatial distribution of the F-number of AEs (a, c) and CEs (b, d). MAM represents March–May, DJF represents December–February, JJA represents June–August, and SON represents September–November. The image resolution is \(0.25\,^{\circ }\) by \(0.25\,^{\circ }\)

4.2 Long-term Variations in Derived Eddy Parameters

The long-term variations in annual mean eddy properties (eddy number, radius, amplitude, and rotational speed) are shown separately for AEs and CEs in the NH and the SH. The eddy number is the annual mean eddy census per day, and the percentage represents the ratio of the annual mean abnormal eddy census per day to the total number of eddies. The eddy radius is the distance from its center to the outermost SSH contour with the maximum average geostrophic speed (U). Here \(U = \sqrt{u^{2} + v^{2}}\), where u and v are the zonal and meridional components of the geostrophic velocity anomaly, which can be computed from the SSH gradients:

$$\begin{aligned} u = - \frac{g}{f}\frac{\partial SSH}{\partial y} \end{aligned}$$
(6)
$$\begin{aligned} v = \frac{g}{f}\frac{\partial SSH}{\partial x} \end{aligned}$$
(7)

where g is the acceleration due to gravity; \(\partial x\) and \(\partial y\) are the eastward and northward distances, respectively; and \(f\) is the Coriolis parameter. Eddy kinetic energy (EKE) is given as \(EKE = \frac{1}{2}(u^{2} + v^{2})\). The amplitude is the difference in SSHA between the eddy core and boundary. The rotational speed is the maximum of the average geostrophic speed around all of the eddy’s closed SSHA contours.
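Equations (6) and (7) and the EKE definition can be evaluated on a regular latitude-longitude grid as sketched below. The constants, the spherical conversion from degrees to metres, and the function names are assumptions of this sketch; it also assumes the grid stays away from the equator, where f approaches zero and the geostrophic relation breaks down.

```python
import numpy as np

G = 9.81             # acceleration due to gravity, m s^-2
OMEGA = 7.2921e-5    # Earth's rotation rate, rad s^-1
R_EARTH = 6.371e6    # mean Earth radius, m

def geostrophic(ssh, lat, lon):
    """Geostrophic u, v (m/s) from ssh (nlat, nlon) in metres; lat, lon in degrees."""
    lat_r = np.deg2rad(np.asarray(lat, dtype=float))
    lon_r = np.deg2rad(np.asarray(lon, dtype=float))
    f = 2.0 * OMEGA * np.sin(lat_r)[:, None]          # Coriolis parameter
    # Northward gradient: dy = R * dlat.
    dssh_dy = np.gradient(ssh, lat_r, axis=0) / R_EARTH
    # Eastward gradient: dx = R * cos(lat) * dlon.
    dssh_dx = np.gradient(ssh, lon_r, axis=1) / (R_EARTH * np.cos(lat_r)[:, None])
    u = -(G / f) * dssh_dy                            # Eq. (6)
    v = (G / f) * dssh_dx                             # Eq. (7)
    return u, v

def eke(u, v):
    """Eddy kinetic energy per unit mass, 0.5 * (u^2 + v^2)."""
    return 0.5 * (u ** 2 + v ** 2)

# Synthetic Gaussian SSH bump on a 0.25-degree grid (illustrative values only).
lat = np.arange(30.0, 40.25, 0.25)
lon = np.arange(140.0, 150.25, 0.25)
lon2d, lat2d = np.meshgrid(lon, lat)
ssh = 0.2 * np.exp(-((lon2d - 145.0) ** 2 + (lat2d - 35.0) ** 2) / 2.0)
u, v = geostrophic(ssh, lat, lon)
print(float(eke(u, v).max()))
```

With SSH in metres the velocities come out in m/s and EKE in m\(^{2}\)/s\(^{2}\); multiplying EKE by \(10^{4}\) converts it to cm\(^{2}\)/s\(^{2}\).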

About 2100 CEs and 2000 AEs per day were detected by the Dense-UNet eddy detection model in each global SSHA map. This is close to the result of Faghmous et al. [18], who identified approximately 2300 CEs and 2300 AEs in each daily SSHA snapshot. The slight difference in eddy number between the two eddy datasets is possibly because Faghmous et al. [18] place no limit on eddy amplitude. There were no significant decreasing or increasing trends in the annual mean eddy parameters for either AEs or CEs during the 1993–2015 period, but the annual mean eddy parameters in the NH and the SH differ (Fig. 8). Eddy numbers in the SH are twice those in the NH, which is consistent with the result in Fig. 7a-b. The annual mean radius of AEs is slightly larger than that of CEs, at about 87.0 km and 86.0 km, respectively. The annual mean amplitude of the CEs is larger than that of the AEs in both hemispheres, and the annual mean eddy amplitude in the SH is larger than that in the NH. The annual mean eddy amplitude of AEs and CEs in the NH (SH) is 6.14 (6.7) cm and 6.29 (7.37) cm, respectively. The difference in amplitude between AEs and CEs is expected from the gradient wind effect, in which the centrifugal force pushes fluid outward in rotating eddies [21], thus intensifying the low pressure at the centers of CEs and weakening the high pressure at the centers of AEs [7].

Fig. 8 Variations in annual mean parameters of mesoscale eddies in the global ocean from 1993 to 2015. a Eddy number, b eddy radius, c eddy amplitude, d rotational speed, and e EKE. \(\mu \) (dotted line) and \(\sigma \) (shading) are mean values and one standard deviation of the annual mean eddy parameters

Similarly, the annual mean eddy rotational speed and EKE of CEs are also larger than those of AEs in their respective hemispheres, since they are derived from eddy amplitude. However, the annual mean rotational speed and EKE of eddies in the NH are larger than those in the SH. The annual mean eddy rotational speed of AEs and CEs in the NH (SH) is 19.73 (18.19) cm/s and 20.91 (19.84) cm/s, respectively. The annual mean EKE of AEs and CEs in the NH (SH) is 152.95 (113.41) cm\(^{2}\)/s\(^{2}\) and 190.00 (145.08) cm\(^{2}\)/s\(^{2}\), respectively.

5 Conclusions

This chapter elaborated on how to apply deep learning technology to global mesoscale eddy detection. We first developed a deep learning-based eddy detection model based on SSHA data. The model combines U-Net with ResNet blocks and is called Res-UNet. The Res-UNet was applied to detect mesoscale eddies in the global ocean. Argo float data were used to verify the Res-UNet model. The Argo float was trapped in the AE and moved in a clockwise loop. This result is consistent with the fact that AEs rotate clockwise in the Northern Hemisphere. Compared to the traditional eddy detection methods, the Res-UNet eddy detection model can accurately identify mesoscale eddies and significantly improve computational efficiency. This result shows that deep learning technology has strong learning abilities and can make better use of datasets for feature extraction.

Considering that eddies may temporarily ‘disappear’ or go undetected because of noise and sampling errors in the SSHA data, the study further develops a multi-modal deep learning model, the Dense-UNet model, to detect mesoscale eddies based on the fusion of SSHA and SST data. The Dense-UNet model extracts SSHA information to determine eddy locations and draws on SST information to supplement and confirm the eddy features embodied in the SSHA data. The results show that the fusion of SSHA and SST data enhances the accuracy and robustness of eddy detection and can also ensure the continuity and accuracy of eddy tracking. Based on the Dense-UNet eddy detection model, mesoscale eddies are detected from satellite SSHA and SST data in the global ocean from 1993 to 2015. The analysis of the spatiotemporal distribution of the 23-year global eddy dataset revealed that eddies were concentrated along western boundary currents. Mesoscale eddies are active in winter in the Northern Hemisphere and vice versa in the Southern Hemisphere. The spatiotemporal distribution of eddies detected by the Dense-UNet model is in good agreement with previous studies, thus further validating the model’s accuracy.

The long-term variations in annual mean eddy properties (eddy number, radius, amplitude, and rotational speed) are analyzed separately for AEs and CEs in the Northern and the Southern Hemisphere. There were no significant decreasing or increasing trends in the annual mean eddy parameters for either AEs or CEs during the 1993–2015 period, but the annual mean eddy parameters in the Northern Hemisphere and the Southern Hemisphere differ. Eddy numbers in the Southern Hemisphere are twice those in the Northern Hemisphere. The annual mean radius of AEs is slightly larger than that of CEs in both hemispheres. The annual mean amplitude of the CEs is larger than that of the AEs in both hemispheres, and the annual mean eddy amplitude in the Southern Hemisphere is larger than that in the Northern Hemisphere. The annual mean eddy rotational speed and EKE of CEs are also larger than those of AEs in their respective hemispheres. However, the annual mean rotational speed and EKE of eddies in the Northern Hemisphere are larger than those in the Southern Hemisphere. The difference in eddy parameters between the two hemispheres is caused by the different generation mechanisms of mesoscale eddies, which deserve further study. In conclusion, the study extends the usage of satellite remote sensing big data, enriches the application of deep learning technology in oceanography, and promotes multidisciplinary research in this area.