Forecasting Tropical Instability Waves Based on Artificial Intelligence

Zheng, Gang; Li, Xiaofeng; Zhang, Ronghua; Liu, Bin

doi:10.1007/978-981-19-6375-9_2

Abstract

The trend of quickly increasing volumes of satellite remote sensing big data and the successful application of deep learning (DL) technology to many research fields inspire us to develop a deep neural network-based DL ocean forecasting model that is driven only by the time series of gridded sea surface temperature (SST) data. The model forecasted the SST pattern variations in the eastern equatorial Pacific Ocean, where a well-known prevailing oceanic phenomenon, tropical instability waves (TIWs) characterized by cusp-shaped waves, propagates westward between 5 $^\circ $S and 5 $^\circ $N. The model was trained and tested in two non-overlapping periods of four (2006–2009) and nine years (2010–2019). The model can make an 18 km $\times $ 18 km gridded five-day SST forecast with a root mean square error of 0.29 $^\circ $C. The model also successfully captures the spatial-temporal propagation characteristics and the seasonal cycle and interannual variability of the TIW modulated by the El Niño-Southern oscillation. Thus, the data-driven DL technology could be a promising way to forecast complicated oceanic phenomena.

1 Sea Surface Temperature and Tropical Instability Waves

With the development of Earth observation satellites and various active and passive sensors, massive volumes of ocean data have been acquired. For instance, the cumulative satellite data archive at the National Oceanic and Atmospheric Administration’s National Centers for Environmental Information reached ~7.5 petabytes in 2016, and the projected volume by 2030 is ~50 petabytes [32]. Many gridded oceanic products (e.g., sea surface temperature (SST), sea surface winds, and sea surface height) have been generated from this deluge of satellite data. These products provide an unprecedented opportunity for in-depth research and demonstrate the urgent need for effective methods to explore time-series data. SST can be measured from space and has the longest record among satellite-derived oceanic products. It is widely used to reveal the evolution of important oceanic phenomena such as El Niño, western boundary currents, and tropical instability waves (TIWs) [18]. Thus, SST is a critical parameter in understanding physical oceanography, biological oceanography, and atmosphere-ocean interaction; it is also a key input parameter for climate and weather modeling. Traditional statistical models have relatively limited complexity, so they may not work well when used to model oceanic phenomena that are complicated by nature.

Recently, a new research and application front has emerged that exploits the tremendous amount of available data using deep learning (DL) technology. With DL, substantially more complex models can be built to mine rules deeply hidden in SST data. DL is a subset of machine learning that teaches computers to learn and make decisions or predictions based on input data. The deep neural network (DNN) technique is one of the most popular and powerful DL techniques, with notable successes in computer vision and speech recognition [15, 17]. A DNN is a multilayer neural network (NN). In most network layers of a DNN, input values are weighted, combined, and then transformed by an activation function to incorporate nonlinearity into the network. The output values of a network layer are passed to the next layer as input. All weights of a DNN are iteratively optimized by combining error backpropagation and gradient-based optimization so that the DNN can find the underlying relationship between its inputs and outputs. Such a multilayer structure allows the DNN to learn data features at multiple levels of abstraction, which a human could hardly conceive [15]. Convolutional layers, named for their mathematical form, are a core type of network layer widely used in DNN models. In a convolutional layer, the output value at a specific site is calculated by weighting and combining the input values at nearby sites, and every output site shares the same weights. Thus, a convolutional layer has fewer weights to optimize than a traditional fully connected layer, which uses independent weights to connect all input and output sites. As a result, convolutional layers are particularly efficient at processing multi-dimensional data. Therefore, compared with traditional statistical models, DNN-based DL models can be much more complex and, after being trained on a large quantity of sample data, can more efficiently learn the inherent characteristics behind the data. 
Recently, DL applications in the prediction of future video frames have drawn extensive attention in the field of computer vision [24, 35]. Ocean SST forecasting is similar to video frame prediction: future SST maps are forecasted from previous maps using a DL model. Because of this similarity, we believe DL technology will help us model oceanic phenomena in a different and promising way that is driven by ever-increasing big ocean data, although DL applications in oceanography and other geosciences began only in recent years [31]. Therefore, using the large accumulated amount and long time series of satellite SST data, we can build a purely data-driven SST forecasting model that captures the spatial-temporal variations of a complicated yet important oceanic phenomenon, the TIW, which affects the transport of heat, mass, and momentum in the ocean, air-sea and biophysical interactions, climate change, etc. As an internally generated ocean variability with time scales of approximately 15–40 days, TIWs produce large perturbations to physical and biological fields in the ocean, including SST. Furthermore, TIW-produced SST perturbations induce almost instantaneous atmospheric surface wind responses, forming TIW-scale interactions between the atmosphere and ocean. Although TIWs are dominantly controlled by the background ocean state, TIW evolution and predictability are affected by air-sea coupling at TIW scales. TIW forecasting is a challenging task because the spatial-temporal variation of TIWs is significant, with large shape distortions and deformations as well as seasonal and interannual variability caused by the El Niño-Southern Oscillation. Dynamical equation-based numerical modeling of TIWs requires both high-resolution spatial discretization and realistic parameterizations of the relevant physical processes. As a result, substantial difficulties exist in realistically simulating TIW-related oceanic and atmospheric responses and the coupled air-sea interactions. Therefore, the data-driven model was applied to the SST field in the eastern equatorial Pacific Ocean to show that TIW propagation can be forecasted by such a model.

Satellite-derived SSTs have long been assimilated into numerical models to improve their forecasts. Recently, NN-based strategies were proposed to play a role similar to data assimilation. For example, in [27], an NN model is used to find the bias correction term in a numerical SST forecasting model. Compared with a numerical model, a data-driven forecasting model is much simpler and computationally more efficient. The forecast made by a data-driven model relies only on prior data of a minimal set of physical parameters, or even a single parameter. As another example, an SST pattern time series can be expanded as the sum of products of time-dependent principal components and corresponding space-dependent eigenvectors following empirical orthogonal function (EOF) analysis. Thus, the forecast of the SSTs at all grid points can be approximately reduced to the forecasts of several leading SST principal components [40]. Recently, NN models were developed to forecast SSTs directly without EOF approximations, including both site-specific and site-independent models. A site-specific model accounts for differences among sites, so it makes SST forecasts with a different NN model at each site [26]. However, because an NN model must be built for every site, the computational cost of the NN-training phase is high, and sufficient NN-training samples are also required at each site. In a site-independent model, different sites share the same SST forecast model [2, 42, 44], which makes site-independent models more efficient. However, when forecasting a future SST at one site, these recent models only use the prior SST series at the closest neighboring sites. The models may have limitations over a large area because SST patterns controlled by large-scale phenomena could be related to each other within a vast ocean area. Thus, an SST series covering a wider area centered on the forecast site may be needed to forecast the future SST.
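The EOF expansion described above can be illustrated with a small sketch. It is not taken from the chapter: the array `sst` is synthetic, the function name `eof_decompose` is hypothetical, and the decomposition is computed with a singular value decomposition of the time-mean-removed field, which is one common way to obtain EOFs.

```python
import numpy as np

def eof_decompose(sst, n_modes):
    """Split a (time, space) field into its time mean, leading principal
    components (time x modes) and EOF spatial patterns (modes x space)."""
    mean = sst.mean(axis=0)
    anomalies = sst - mean
    # Columns of u scaled by s are the principal components; rows of vt are the EOFs.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]
    eofs = vt[:n_modes]
    return mean, pcs, eofs

# Synthetic example (assumption): two space-time modes plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(200)[:, None]
x = np.arange(50)[None, :]
sst = (25.0
       + np.sin(0.2 * t) * np.cos(0.1 * x)
       + 0.5 * np.cos(0.05 * t) * np.sin(0.3 * x)
       + 0.01 * rng.standard_normal((200, 50)))

mean, pcs, eofs = eof_decompose(sst, n_modes=2)
reconstruction = mean + pcs @ eofs          # sum of PC x EOF products
rel_err = np.linalg.norm(sst - reconstruction) / np.linalg.norm(sst - mean)
```

In an EOF-based forecast, only the few columns of `pcs` would be forecast in time, and the SST map would then be rebuilt as in `reconstruction`.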

In the following section, we introduce a multi-scale scheme DNN with four stacked composite layers for SST forecasting in the eastern equatorial Pacific Ocean, which overcomes the shortcomings of previous data-driven SST forecasting models. The idea of a multi-scale scheme has achieved notable successes in the field of computer vision, e.g., DNN applications in semantic segmentation [21, 33], but has not been explored in the oceanography field. Considering the natural differences among different sites, we also build a space-dependent but time-independent bias correction map and then combine it with the multi-scale DNN to develop the final data-driven SST forecasting model, named the DL model for brevity.

The developed DL model was applied to forecast the SST pattern variations associated with the TIWs in the eastern equatorial Pacific Ocean. TIWs are an important ocean dynamic phenomenon in both the equatorial Pacific and Atlantic Oceans. They were first captured in current meter records and infrared satellite images in the 1970s [6, 18]. One prominent characteristic of Pacific TIWs is their cusp-shaped, westward-propagating waves on both flanks of the equatorial Pacific cold tongue, with a stronger signal on the northern flank. Previous studies have estimated the wavelength, period, and phase speed of TIWs from various data sources, and their values are typically within the ranges of 600 to 2000 km, 15 to 40 days, and 17 to 86 km/day, respectively [3, 4, 12, 13, 19, 28, 29, 38, 39]. Previous studies also suggested that TIWs could be generated by barotropic and baroclinic instability processes arising from the meridional and vertical shear among the westward South Equatorial Current, the eastward Equatorial Undercurrent, and the North Equatorial Counter Current [4, 23, 30, 34]. As a result, TIWs are inactive/active during boreal spring/fall, because the current shear is weaker/stronger at that time. Moreover, TIWs are suppressed and even indiscernible during strong El Niño years, when the Pacific cold tongue and the related equatorial current shear are too weak, and the opposite holds during La Niña years [39]. Conversely, TIWs also feed back on the El Niño-Southern Oscillation, affecting its asymmetry and irregularity [1, 10, 11]. The physical and biological processes of TIWs are complicated. As has been widely illustrated, TIWs have a profound effect on the distribution of SST, sea surface height anomaly, chlorophyll-$\alpha $, rain, salinity, and winds in the eastern equatorial Pacific Ocean [3, 14, 28, 29, 38]. TIWs induce horizontal convection and vertical mixing in the upper ocean [12, 13, 20, 25]. 
The mixing reaches even the lower half of the thermocline, a fact that is still not well considered in most physical models [20]. TIWs affect the equatorial chlorophyll-$\alpha $ concentration by transporting nutrients to the upper ocean [7, 9, 43]. Conversely, modeling analyses indicate that chlorophyll-$\alpha $ may modulate solar radiation in the upper ocean and weaken TIWs [36, 37]. TIWs also interact with the atmosphere because of the sea surface wind modulation caused by the TIW-induced SST anomalies [21, 41, 45, 46, 47]. Moreover, a spatial correlation between SST and cloud patterns is observed during the TIW seasons. The clouds appearing in the warm troughs of the TIWs are usually generated by cool low-level winds crossing the SST fronts and, in turn, dampen the TIW-induced SST anomalies by reducing the incident solar radiation over the warm troughs [5]. The development of more comprehensive physical models for TIW studies is still ongoing, and many of the above-mentioned aspects should be considered to make the models more realistic [12, 14, 20, 36, 37, 45, 46, 47], which is a difficult challenge. In contrast, the time series of data contain the effects of all these factors. Owing to its strong data-mining ability, a data-driven DL model can automatically learn comprehensive rules of SST spatial-temporal variations from the data, without having to describe the various complex processes with physical equations.

2 Data and Model of SST Forecasting

There are two parts in the model: a DNN and a constant map. The DNN is multi-scale, having a network structure of four stacked composite layers for different spatial resolutions. The DNN uses the SSTs from the preceding fourteen steps to estimate the SSTs at the following step. The interval between the two steps is five days. The DNN-made estimation is followed by the correction with the constant map for reducing bias. The details are given below.

2.1 Satellite Remote Sensing SST Data

The DL model was built and tested with the SST products of Remote Sensing Systems. The products were made from both microwave and infrared sensor measurements. Our study area is a rectangular region spanning from 120 $^\circ $W to 180 $^\circ $W in longitude and from 10 $^\circ $S to 10 $^\circ $N in latitude. The products from 2006 to 2019 were collected for our study. These 9-km-grid products were averaged to 18-km-grid SST data. The SST data were divided into two parts according to time. The first part (1 Jan 2006–31 Dec 2009) and the second part (1 Jan 2010–31 Mar 2019) were used to build and test the DL model, respectively. Considering that TIWs have a temporal scale of about fifteen to forty days, the time step of the DL model is set to five days. Based on the SST maps at the preceding thirteen steps and the current step, the DL model forecasts the SST map at the following time step (five days ahead). Therefore, a sample in our study is an SST series consisting of fifteen consecutive SST maps. The SST series was then shifted day by day to obtain the second, third, fourth, etc. samples. The DL model forecasts the fifteenth-step SST map in each series based on the SST maps of the first fourteen steps. The forecasted SST map was then validated against the fifteenth-step SST map of the series. Approximately one thousand four hundred series were generated from the first part of the SST data, and three thousand four hundred series were generated from the second part. It should be noted that a significant El Niño event occurred during 2014–2016, which is covered by the second part of the SST data.
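The sample construction can be sketched as follows. This is an illustrative reading of the description, not the authors' code: it assumes one SST map per calendar day with no missing days, and the names `count_series` and `series_day_offsets` are hypothetical.

```python
from datetime import date

STEP_DAYS = 5    # time step of the DL model
N_MAPS = 15      # 14 input maps plus the 1 target map

def count_series(start, end):
    """Number of 15-map series obtained by shifting the first map day by day
    (assumes a gap-free daily record from start to end, inclusive)."""
    n_days = (end - start).days + 1
    span = (N_MAPS - 1) * STEP_DAYS + 1   # days covered by one series
    return max(n_days - span + 1, 0)

def series_day_offsets(first_day):
    """Day indices of the 15 maps in the series starting at first_day."""
    return [first_day + STEP_DAYS * k for k in range(N_MAPS)]

n_train = count_series(date(2006, 1, 1), date(2009, 12, 31))
n_test = count_series(date(2010, 1, 1), date(2019, 3, 31))
offsets = series_day_offsets(0)   # [0, 5, ..., 70]; the last entry is the target
```

Under these assumptions the counts are 1391 and 3307, which are of the same order as the approximate figures quoted in the text.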

2.2 Architecture and Training of the DL Model

As shown in Fig. 1, the DL model is composed of a trained multi-scale DNN and a time-independent bias-correction map. The DNN is a stack of four composite layers, and each composite layer has four cascaded convolutional layers. In this region, SST values range from 16 $^\circ $C to 34 $^\circ $C, and this range was rescaled to [−1, 1]. To feed the SST maps to the corresponding composite layers at different stack levels, a 2 $\times $ 2 average pooling operation was used to down-sample them. These composite layers process the SST maps at different spatial resolutions: the lower the stack level, the higher the resolution. Except at the top level, each higher-resolution composite layer at a lower stack level requires the output of the composite layer at the stack level above it, and this output needs to be up-sampled. The input of the DNN consists of 14 SST maps at the current step and the previous 13 steps. Because the input SST map at the current time step is more strongly correlated with the future SST map, the DL model also links the input SST map at the current step directly to the last convolutional layer, along with the up-sampled output of the lower-resolution composite layer at the upper stack level. The rectified linear unit function has better error gradient propagation [8], so it was used as the activation for the first three convolutional layers of each composite layer. The tanh activation was used for the last convolutional layer of each composite layer except the bottom one. It rescales the output of each composite layer to [−1, 1], which matches the input range of the higher-resolution composite layer to which the output is fed after up-sampling. The activation of the last convolutional layer of the bottom composite layer is a linear function, which leaves the DNN output unbounded. The four convolutional layers of each composite layer have 8, 16, 32, and 1 channels, respectively. 
The kernel sizes of the four convolutional layers of the top composite layer are all 3 $\times $ 3. Those of the other composite layers are 5 $\times $ 5, 3 $\times $ 3, 3 $\times $ 3 and 5 $\times $ 5, respectively.
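The input preparation for the four stack levels can be sketched with NumPy. The sketch covers only the rescaling to [−1, 1] and the repeated 2 × 2 average pooling described above; the convolutional layers themselves are not reproduced, and the function names and the example map size are assumptions.

```python
import numpy as np

SST_MIN, SST_MAX = 16.0, 34.0   # SST range of the region in degrees C

def rescale(sst_c):
    """Map SST in degrees C linearly from [16, 34] to [-1, 1]."""
    return 2.0 * (sst_c - SST_MIN) / (SST_MAX - SST_MIN) - 1.0

def unscale(x):
    """Inverse of rescale: map [-1, 1] back to degrees C."""
    return (x + 1.0) * (SST_MAX - SST_MIN) / 2.0 + SST_MIN

def avg_pool_2x2(maps):
    """2 x 2 average pooling over the last two axes (sizes must be even)."""
    *lead, h, w = maps.shape
    assert h % 2 == 0 and w % 2 == 0
    return maps.reshape(*lead, h // 2, 2, w // 2, 2).mean(axis=(-3, -1))

def build_pyramid(maps, n_levels=4):
    """Inputs for the stack levels, from the finest (bottom) to the coarsest (top)."""
    levels = [maps]
    for _ in range(n_levels - 1):
        levels.append(avg_pool_2x2(levels[-1]))
    return levels

# Hypothetical input: 14 maps of 64 x 128 grid cells at a uniform 25 degrees C.
maps = rescale(np.full((14, 64, 128), 25.0))
pyramid = build_pyramid(maps)
pooled = avg_pool_2x2(np.arange(16.0).reshape(4, 4))
```

Each entry of `pyramid` would be fed to the composite layer at the matching stack level, with the coarsest entry going to the top level.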

For a general network layer, one site in the output map is connected to multiple sites in the input map. Thus, the value at the output site depends only on the values at these input sites rather than on the whole input map. These input sites form the receptive field of the output site. For instance, the input sites inside a receptive field of a convolutional layer are weighted and connected to the corresponding output site by the convolution kernel. The receptive field can be enlarged by using average pooling layers to down-sample the inputs before feeding them to the subsequent layer. The output can then be passed through the same number of up-sampling layers to restore the resolution. SST variations at different locations may be correlated through large-scale oceanic phenomena. Considering this, we use the SST series of a wider area to forecast the SST at the center of that area. Therefore, the DNN is designed to be multi-scale to obtain a wider receptive field. After three down- and up-samplings among the four composite layers, the receptive field of the whole DNN is enlarged by a factor of about twelve. This size is large enough for forecasting TIWs.
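The effect of kernel size and down-sampling on the receptive field can be checked with simple arithmetic. The sketch below uses the standard rule for stride-1 convolutions applied in sequence and the kernel sizes and 18 km grid spacing given in the text. It only looks at each level in isolation; the receptive field of the whole DNN depends on how the levels are wired together, which is not reproduced here, so the sketch does not attempt to rederive the factor of about twelve.

```python
def stacked_conv_rf(kernel_sizes):
    """Receptive field width (in grid cells) of stride-1 convolutions in sequence."""
    return 1 + sum(k - 1 for k in kernel_sizes)

GRID_KM = 18                    # grid spacing of the finest level
TOP_KERNELS = [3, 3, 3, 3]      # top composite layer
OTHER_KERNELS = [5, 3, 3, 5]    # the other composite layers

rf_single = stacked_conv_rf(OTHER_KERNELS)    # one composite layer at full resolution
rf_top_coarse = stacked_conv_rf(TOP_KERNELS)  # top layer, in coarse cells
top_factor = 2 ** 3                           # three 2 x 2 poolings: 1 coarse cell = 8 fine cells
rf_top_fine = rf_top_coarse * top_factor      # top layer, in fine cells

print(rf_single, rf_single * GRID_KM)         # 13 cells, 234 km
print(rf_top_fine, rf_top_fine * GRID_KM)     # 72 cells, 1296 km
```

Even taken alone, the top level sees a span of about 1300 km, which is comparable to the TIW wavelengths of 600 to 2000 km quoted earlier.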

The SST-map-series samples for building the DNN were divided into the training and validation datasets, according to the ratio of 3:1. The input area is set to be larger than the output (forecast) area in order to ensure that the input area covers the whole DNN receptive field. The following loss function is used to optimize the DNN:

$$\begin{aligned} \text{Loss} = \sum_{k = 1}^{K} \sum_{(m,n) \in \text{Grids}_{\text{output}}} \left( \text{SST}_{\text{output}}^{(k)}(m,n) - \text{SST}_{\text{true}}^{(k)}(m,n) \right)^{2} \end{aligned}$$

(1)

where $\text{SST}_{\text{true}}^{(k)}(m,n)$ is the fifteenth-step satellite SST map and $\text{SST}_{\text{output}}^{(k)}(m,n)$ is the DNN-forecasted SST. The index k denotes the kth sample, and K is the number of samples in the training or validation dataset. (m, n) denotes a grid point of the output area, and $\text{Grids}_{\text{output}}$ is the set of these grid points. The Adam algorithm [16] was used to optimize the DNN parameters on the training dataset, and the maximum number of epochs was set to 2500. The optimization was implemented using the CUDA technique on an NVIDIA Quadro M4000 graphics card with 8 GB of memory. To avoid overfitting to the training dataset, the loss value on the validation dataset was also calculated during the optimization procedure. The smallest validation loss was achieved at the 129th epoch, after about 114 minutes. The parameter values corresponding to the smallest validation loss were adopted.
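Equation (1) and the selection of the parameters with the smallest validation loss can be written compactly. This is a minimal sketch with hypothetical names and made-up loss values; the training loop and the Adam optimizer are omitted.

```python
import numpy as np

def sse_loss(sst_output, sst_true):
    """Eq. (1): squared error summed over all samples k and output grid points (m, n).
    Both arrays have shape (K, M, N)."""
    return float(np.sum((sst_output - sst_true) ** 2))

def best_epoch(val_losses):
    """1-based epoch with the smallest validation loss, and that loss."""
    i = int(np.argmin(val_losses))
    return i + 1, val_losses[i]

# Toy check: every forecast value is off by exactly 1 on a 2 x 3 x 3 array.
loss = sse_loss(np.ones((2, 3, 3)), np.zeros((2, 3, 3)))

# Illustrative (made-up) validation losses recorded after each epoch.
val_losses = [5.0, 3.2, 2.9, 3.1, 3.4]
epoch, value = best_epoch(val_losses)
```

In practice the parameters saved at the selected epoch would be kept, which is the early-stopping style selection the text describes.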

The parameters of the convolutional layers are the same for all sites. In addition, neither the average pooling layers nor the up-sampling layers have optimizable parameters. Thus, the DNN is independent of the site. However, the environmental background of the study area is inhomogeneous: there is a spatial trend in which the SST is higher overall in the west than in the east. This may cause the SST patterns in different areas to evolve differently. Therefore, an SST correction map is included in the DL model and is added to the DNN-forecasted SST map to make the final forecast (Fig. 1). This SST correction map is generated from the samples of the training period by calculating the bias of the DNN at each grid point after the optimization.
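A minimal sketch of such a correction map is given below. It assumes that the map is the time-mean difference between the satellite SST and the DNN forecast at each grid point, so that adding it removes the mean error; the chapter does not spell out the sign convention, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def correction_map(dnn_forecasts, satellite_sst):
    """Time-mean of (satellite - forecast) at each grid point.
    Inputs have shape (samples, lat, lon); the result has shape (lat, lon)."""
    return (satellite_sst - dnn_forecasts).mean(axis=0)

def corrected_forecast(dnn_forecast, corr_map):
    """Final forecast: DNN forecast plus the time-independent correction map."""
    return dnn_forecast + corr_map

# Synthetic check (assumption): the DNN output is offset by a fixed spatial pattern.
rng = np.random.default_rng(1)
truth = rng.normal(25.0, 1.0, size=(500, 4, 6))
offset = np.linspace(-0.2, 0.2, 24).reshape(4, 6)
forecasts = truth - offset

corr = correction_map(forecasts, truth)
residual_bias = (corrected_forecast(forecasts, corr) - truth).mean(axis=0)
```

Because the map is computed once from the training period, it adds no cost at forecast time beyond one array addition.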

The operating efficiency of the developed DL model is very high. It only takes about 1 minute to forecast SSTs for all testing samples on an ordinary desktop computer.

3 SST Forecast of TIW Motion Using the DL Model During the Testing Period (2010/01–2019/03)

Figure 2a–c shows satellite SST maps from the testing period, and Fig. 2d–f shows the corresponding SST forecasts by the DL model. The maps match closely in shape. The most notable feature is the westward-propagating TIW signature, which appears as cusp-shaped waves with irregular deformations.

Figure 3 shows the outputs of the four composite layers in Fig. 1 at three consecutive time steps, visualized from the first (bottom) to the fourth (top) stack level of the DNN. For the sake of clarity, the coarse-resolution results at higher levels are converted to the initial resolution using nearest neighbor interpolation, and the results are then rescaled to [−1, 1]. All outputs show a westward-propagating signal similar to the satellite SST maps shown in Fig. 2a–c. These maps are extracted from the DNN during the training period (2006–2009) and show the temporal and spatial characteristics of TIWs. The related parameters in the network are learned by the DNN from sample data. The TIWs’ motion can be forecasted from these features.

The forecasted and satellite SST maps’ meridional averages (MAs) are calculated. The maximum detrended cross-correlation between the MAs at the current time step and the next step along the equator can estimate the westward propagation speed of the SST pattern.

During TIW seasons, the MAs of the SST reflect the westward propagation signal of the SST pattern. The forecast area has an approximately linear zonal SST trend, warm in the western part and cold in the eastern part, and this trend is superimposed on the propagation signal. An example of two zonal sequences of SST MAs at the longitudes of the forecast-area grid points and at two consecutive time steps is given in Fig. 4a. The red lines represent the MAs of the DL-forecasted SST map after five days (one time step), and the blue lines represent the MAs of the satellite SST map. The westward propagation of the signal becomes more obvious after the linear zonal trend of the SST MAs is removed (Fig. 4b). The cross-correlations between the two detrended SST MA sequences are calculated at discrete zonal lags (Fig. 4c), and the discrete lag with the maximum cross-correlation and its two neighboring discrete lags are identified (Fig. 4d). A quadratic curve is then fitted through the cross-correlations at these three discrete lags. The peak lag of the interpolated curve is taken as the exact lag of the maximum cross-correlation between the two detrended SST MA sequences (Fig. 4e). In mathematical form, this is

$$\begin{aligned} lag_{\text{exact}} = \frac{1}{2} \cdot \frac{y_{1}(lag_{2}^{2} - lag_{3}^{2}) + y_{2}(lag_{3}^{2} - lag_{1}^{2}) + y_{3}(lag_{1}^{2} - lag_{2}^{2})}{y_{1}(lag_{2} - lag_{3}) + y_{2}(lag_{3} - lag_{1}) + y_{3}(lag_{1} - lag_{2})} \end{aligned}$$

(2)

where $lag_{1}$, $lag_{2}$, and $lag_{3}$ are the three discrete lags, and $y_{1}$, $y_{2}$, and $y_{3}$ are the corresponding cross-correlations. Finally, the propagation speed is obtained by dividing the exact lag by the time interval.
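The speed estimate can be sketched end to end as follows. The sketch assumes that the zonal index increases eastward, that the grid spacing is 18 km and the time step is five days, and that the cross-correlation at each lag is the Pearson correlation over the overlapping part of the two sequences; these choices and all function names are assumptions rather than the authors' implementation. The function `exact_lag` is Eq. (2).

```python
import numpy as np

def detrend(x):
    """Remove the least-squares linear zonal trend from a 1-D sequence."""
    idx = np.arange(x.size)
    slope, intercept = np.polyfit(idx, x, 1)
    return x - (slope * idx + intercept)

def lagged_corr(a, b, lag):
    """Correlation between a[i] and b[i + lag] over the overlapping part."""
    if lag >= 0:
        u, v = a[:a.size - lag], b[lag:]
    else:
        u, v = a[-lag:], b[:b.size + lag]
    return np.corrcoef(u, v)[0, 1]

def exact_lag(lags, ys):
    """Eq. (2): vertex of the parabola through three (lag, correlation) points."""
    l1, l2, l3 = lags
    y1, y2, y3 = ys
    num = y1 * (l2**2 - l3**2) + y2 * (l3**2 - l1**2) + y3 * (l1**2 - l2**2)
    den = y1 * (l2 - l3) + y2 * (l3 - l1) + y3 * (l1 - l2)
    return 0.5 * num / den

def westward_speed(ma_now, ma_next, grid_km=18.0, step_days=5.0, max_lag=40):
    """Westward propagation speed in km/day from two meridional-average sequences."""
    a, b = detrend(ma_now), detrend(ma_next)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.array([lagged_corr(a, b, int(l)) for l in lags])
    i = int(np.argmax(cc))
    i = min(max(i, 1), lags.size - 2)          # keep both neighbours in range
    shift = exact_lag(lags[i - 1:i + 2], cc[i - 1:i + 2])
    return -shift * grid_km / step_days        # negative lag means westward motion

# Synthetic check (assumption): a wave on a linear trend, moved 5.5 cells westward.
x = np.arange(334)
wave = lambda pos: np.sin(2 * np.pi * pos / 60.0)
ma_now = 26.0 - 0.01 * x + wave(x)
ma_next = 26.0 - 0.01 * x + wave(x + 5.5)
speed = westward_speed(ma_now, ma_next)        # expected near 5.5 * 18 / 5 km/day
```

The quadratic refinement is what allows the estimate to resolve shifts smaller than one grid cell, here the half-cell part of the 5.5-cell displacement.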

Figure 5 shows that the estimated speeds mainly range from 0 to 100 km/day, comparable to the values reported in previous studies [3, 4, 12, 13, 19, 28, 29, 38, 39]. The green solid curve represents the SST pattern propagation velocity predicted by the DL model. The red dashed curve represents the velocity estimated from the satellite/satellite SST MA pairs. The two curves are in good agreement, and both show very consistent TIW seasonal fluctuations. In the TIW season, TIWs control the motion of the SST pattern. Thus, the DL-forecasted SST pattern propagation velocity can be regarded as the TIW speed. In contrast, the SST pattern is nearly stationary, with no apparent westward motion, in the no- or weak-TIW seasons.

The DL model can also forecast recursively. In this recursive frame, the forecasted SST, the present satellite SST, and the previous 12 satellite SSTs were used to forecast the SST at the second recursive step, and then, the two forecasted SSTs, the current satellite SST and the previous 11 satellite SSTs were utilized to forecast the SST at the third recursive step. Therefore, the DL model recursively forecasts the SST in the subsequent steps (the fourth, fifth, sixth, etc. recursive steps). Figure 6 shows an example of the recursively forecasted SST maps at the subsequent three time steps after the final time step in Fig. 2. As can be seen from the figure, the DL model can still work well and forecast the TIWs’ westward motion in general.
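The recursive scheme amounts to a sliding 14-map window in which the oldest map is dropped and the newest forecast is appended at every step. The sketch below shows only this bookkeeping; `persistence_model` is a placeholder that stands in for the trained DL model, and all names are hypothetical.

```python
import numpy as np

N_INPUT = 14   # number of input maps used by the model

def recursive_forecast(model, history, n_steps):
    """Forecast n_steps ahead, feeding each forecast back into the input window.
    history: array of shape (>= 14, lat, lon), ordered from oldest to newest."""
    window = list(history[-N_INPUT:])
    forecasts = []
    for _ in range(n_steps):
        next_map = model(np.stack(window))
        forecasts.append(next_map)
        window = window[1:] + [next_map]   # drop the oldest map, append the forecast
    return np.stack(forecasts)

def persistence_model(maps):
    """Placeholder for the trained DL model: repeats the most recent map."""
    return maps[-1].copy()

# Toy history: map k is filled with the value k, for k = 0..13.
history = np.arange(14, dtype=float)[:, None, None] * np.ones((14, 3, 4))
out = recursive_forecast(persistence_model, history, n_steps=16)
```

After 14 recursive steps the window contains forecasts only, which is the point beyond which no satellite SST enters the model input.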

4 Interannual Variation in TIW Westward Propagation

The daily Niño3.4 index data are also overlaid on Fig. 5 and denoted by the orange dotted curve. The data were provided by the KNMI (Royal Netherlands Meteorological Institute) Climate Explorer. Fig. 5 shows that the DL-forecasted TIW speed values and the Niño3.4 index values are 180 degrees out of phase. A major El Niño event occurred from 2014 to 2016, and the TIW speeds were almost zero during this time because of the weakening of the meridional SST gradients. Mooring and Argo float measurements from 2000 to 2010 support this finding: TIW kinetic energy and occurrence probability show a negative correlation with the Niño3.4 index [11]. The correlation coefficient between the Niño3.4 index values and the speed values estimated from satellite/satellite SST MA pairs is −0.38, with a P-value close to zero and a 95% confidence interval of (−0.35, −0.41). The corresponding correlation coefficient for the DL-forecasted speeds is −0.53, with a P-value close to zero and a 95% confidence interval of (−0.50, −0.55).
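The chapter does not state how the confidence intervals were computed. One common choice, assumed here, is the Fisher z-transformation; the sample size of 3300 is also an assumption, taken from the approximate number of testing samples. The sketch treats the samples as independent, which daily overlapping series are not, so the interval should be read as illustrative.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r
    from n independent samples, using the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r from the text; n = 3300 is an assumed sample size.
lo, hi = fisher_ci(-0.38, 3300)
print(round(lo, 2), round(hi, 2))
```

Under these assumptions the interval for r = −0.38 comes out at roughly (−0.41, −0.35), close to the values quoted above.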

5 Zonally Westward Propagation of TIWs

Figure 7 gives the zonal TIW westward propagation speeds at 2-degree latitude bands, which were estimated from the satellite/satellite maps and the satellite/DL-forecasted SST maps, respectively. As can be seen from the figure, the estimated speed distributions are consistent with each other and their temporal fluctuations are similar during TIW seasons. The fluctuations are also similar to the curves in Fig. 5. Furthermore, the equatorial bands have higher speeds than the higher-latitude bands. All these results are in agreement with the previous findings for the reason that TIWs at different latitudes are controlled by different dynamic mechanisms with their speeds determined by equatorial wave processes [22, 38].

6 Accuracy During the Testing Period (2010/01–2019/03)

The root mean square error (RMSE) and bias of the DL model were calculated over time during the testing period and are given in Fig. 8. The figure shows that the RMSE and bias are generally stable. The RMSE fluctuates between 0.15 $^\circ $C and 0.45 $^\circ $C, while the bias fluctuates between $-0.15\,^\circ $C and 0.15 $^\circ $C. Because the SST pattern changes rapidly, the RMSE of the DL model is larger during the TIW seasons (Fig. 8a). There are approximately 3300 samples at each grid point. The RMSE and bias at each grid point were calculated, and their spatial distributions are given in Fig. 9. The RMSE in the cold tongue area is higher than in other areas. This is caused by the large spatial gradient and fast temporal variation of the SST in the cold tongue area. Over all grid points and all samples in the study area, the global RMSE is 0.29 $^\circ $C and the bias is $-0.01\,^\circ $C.
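The three kinds of statistics used here (per time step, per grid point, and global) differ only in the axes over which the error is averaged. A minimal sketch, assuming arrays of shape (samples, lat, lon) and bias defined as forecast minus satellite SST:

```python
import numpy as np

def rmse_and_bias(forecast, truth, axis=None):
    """RMSE and bias (forecast minus truth) averaged over the given axes."""
    err = forecast - truth
    return np.sqrt(np.mean(err**2, axis=axis)), np.mean(err, axis=axis)

# Toy data (assumption): every forecast is 0.3 degrees C too warm.
truth = np.zeros((4, 2, 3))
forecast = truth + 0.3

rmse_t, bias_t = rmse_and_bias(forecast, truth, axis=(1, 2))  # one value per sample (time series)
rmse_xy, bias_xy = rmse_and_bias(forecast, truth, axis=0)     # one value per grid point (map)
rmse_all, bias_all = rmse_and_bias(forecast, truth)           # single global value
```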

For the recursive forecasting, the global RMSE and bias of the DL model from 5 days to 150 days after the current time step (i.e., recursive steps 1 to 30) are given in Fig. 10. It can be found that the DL model’s accuracy declines with the evolution of time. It should be noted that there will be no satellite SST in the model input after 14 recursive steps. Even so, the RMSE does not grow rapidly and is still smaller than 0.80 $^\circ $C at the 15th recursive step. Meanwhile, the magnitude of the DL model’s bias is also smaller than 0.10 $^\circ $C at the 30th recursive step.

7 Conclusions

In this chapter, a data-driven DL SST forecasting model using the DNN technique was built. The DL model accurately forecasted the spatial-temporal variation of the SST pattern with an RMSE of 0.29 $^\circ $C, and the forecasted TIW propagation agrees well with actual satellite observations.

The DL model is different from previous models. It consists of a multi-scale DNN with four stacked composite layers and a time-independent but site-dependent bias correction map. With this design, the DL model takes into account both the dependence of a site-specific forecast on a large surrounding area and the bias of the DNN at different sites. The DL model was tested over nine years that do not overlap with the training period. The results show that the DL model effectively forecasts the SST variation associated with TIWs. The DL-forecasted TIW speed is in good agreement with that estimated from the satellite SST maps. Both speeds show a consistent seasonal cycle and interannual modulation, and the interannual modulation is negatively correlated with the Niño3.4 index. TIW speeds are higher near the equator than at other latitudes. The DL model can also forecast SSTs at future steps in a recursive manner, although the accuracy degrades with time because actual satellite SST input is progressively lost.

The results of the developed model show the DNN’s great potential for marine forecasting with gridded data. Compared with numerical forecasting models, DL forecast models are driven directly by real measurements and avoid the complexity of model parameterizations and approximations, various physical equations, and a substantial computational burden. DL models are able to forecast accurately with prior information on only a few physical parameters. In our case, only one parameter, SST, was used. Almost all of the DL model’s computational cost is spent on the iterative optimization of the weights. Emerging hardware technologies, e.g., CUDA, can readily speed up this learning procedure. Once the DNN has been trained and the bias correction map obtained, the DL model can make a forecast without any iteration, so it works very rapidly. In our case, it takes only about one minute to forecast the SST patterns of the whole testing period on an ordinary desktop computer. Because the DNN is a data-driven technology, sufficient data is a basic requirement for both training and application. Fortunately, the growing amount of marine satellite observations in the era of remote sensing big data supplies such data, and the DNN’s learning capability is well suited to exploiting them.

References

An SI (2008) Interannual variations of the tropical ocean instability wave and ENSO. J Clim 21(15):3680–3686. https://doi.org/10.1175/2008JCLI1701.1
Aparna SG, D’Souza S, Arjun NB (2018) Prediction of daily sea surface temperature using artificial neural networks. Int J Remote Sens 39(12):4214–4231
Chelton DB, Wentz FJ, Gentemann CL, de Szoeke RA, Schlax MG (2000) Satellite microwave SST observations of transequatorial tropical instability waves. Geophys Res Lett 27(9):1239–1242
Contreras RF (2002) Long-term observations of tropical instability waves. J Phys Oceanogr 32(9):2715–2722
Deser C, Wahl S, Bates JJ (1993) The influence of sea surface temperature gradients on stratiform cloudiness along the equatorial front in the Pacific Ocean. J Clim 6(6):1172–1180
Düing W, Hisard P, Katz E, Meincke J, Miller L, Moroshkin KV, Philander G, Ribnikov AA, Voigt K, Weisberg R (1975) Meanders and long waves in the equatorial Atlantic. Nature 257(5524):280–284
Evans W, Strutton PG, Chavez FP (2009) Impact of tropical instability waves on nutrient and chlorophyll distributions in the equatorial Pacific. Deep-Sea Res Part I 56(2):178–188
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, pp 315–323
Gorgues T, Menkes C, Aumont O, Vialard J, Dandonneau Y, Bopp L (2005) Biogeochemical impact of tropical instability waves in the equatorial Pacific. Geophys Res Lett 32
Holmes R, McGregor S, Santoso A, England M (2019) Contribution of tropical instability waves to ENSO irregularity. Clim Dyn 52(3–4):1837–1855. https://doi.org/10.1007/s00382-018-4217-0
Imada Y, Kimoto M (2012) Parameterization of tropical instability waves and examination of their impact on ENSO characteristics. J Clim 25(13):4568–4581
Inoue R, Lien RC, Moum JN (2012) Modulation of equatorial turbulence by a tropical instability wave. J Geophys Res: Ocean
Jochum M, Cronin MF, Kessler WS, Shea D (2007a) Observed horizontal temperature advection by tropical instability waves. Geophys Res Lett 34(9)
Jochum M, Deser C, Phillips A (2007b) Tropical atmospheric variability forced by oceanic internal variability. J Clim 20(4):765–771
Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Legeckis R (1977) Long waves in the eastern equatorial Pacific Ocean: a view from a geostationary satellite. Science 197(4309):1179–1181
Legeckis R, Brown CW, Chang PS (2002) Geostationary satellites reveal motions of ocean surface fronts. J Marine Syst 37(1–3):3–15
Liu C, Köhl A, Liu Z, Wang F, Stammer D (2016) Deep-reaching thermocline mixing in the equatorial Pacific cold tongue. Nat Commun 7:11576
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: 2015 IEEE Conference on computer vision and pattern recognition (CVPR)
Lyman JM, Johnson GC, Kessler WS (2007) Distinct 17- and 33-day tropical instability waves in subsurface observations. J Phys Oceanogr 37(4):855
Masina S, Philander S, Bush A (1999) An analysis of tropical instability waves in a numerical model of the Pacific Ocean: 2. Generation and energetics of the waves. J Geophys Res: Ocean 104(C12)
Mathieu M, Couprie C, LeCun Y (2016) Deep multi-scale video prediction beyond mean square error. arXiv:1511.05440
Moum JN, Lien RC, Perlin A, Nash JD, Wiles PJ (2009) Sea surface cooling at the Equator by subsurface mixing in tropical instability waves. Nat Geosci 2(11):761–765
Patil K, Deo MC (2018) Basin-scale prediction of sea surface temperature with artificial neural networks. J Atmos Ocean Technol 35(7):1441–1455
Patil K, Deo MC, Ravichandran M (2016) Prediction of sea surface temperature by combining numerical and neural techniques. J Atmos Ocean Technol 33(8):1715–1726
Polito PS, Ryan JP, Liu WT, Chavez FP (2001) Oceanic and atmospheric anomalies of tropical instability waves. Geophys Res Lett 28(11):2233–2236
Qiao L, Weisberg RH (1995) Tropical instability wave kinematics: observations from the tropical instability wave experiment. J Geophys Res Ocean 100(C5):8677–8693
Qiao L, Weisberg RH (1998) Tropical instability wave energetics: observations from the tropical instability wave experiment. J Phys Oceanogr 28(2):345–360
Reichstein M, Camps-Valls G, Stevens B, Jung M, Denzler J, Carvalhais N, Prabhat (2019) Deep learning and process understanding for data-driven Earth system science. Nature 566(7743):195–204
Ritchey N (2017) NCEI’s long term archive: infrastructure, processes, volume and trend. The 44th meeting of the working group on information systems & services (WGISS)
Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. Springer International Publishing
Philander SGH (1976) Instabilities of zonal equatorial currents. J Geophys Res 81(21):3725–3735
Shi X, Chen Z, Wang H, Yeung DY, Wong WK, Woo WC (2015) Convolutional LSTM network: a machine learning approach for precipitation nowcasting. MIT Press
Tian F, Zhang R, Wang X (2018) A coupled ocean physics-biology modeling study on tropical instability wave-induced chlorophyll impacts in the Pacific. J Geophys Res: Ocean
Tian F, Zhang R, Wang X (2019) A positive feedback onto ENSO due to tropical instability wave (TIW)-induced chlorophyll effects in the Pacific. Geophys Res Lett 46(2):889–897
Tong L, Lagerloef G, Gierach MM, Kao H, Yueh S, Dohan K (2012) Aquarius reveals salinity structure of tropical instability waves. Geophys Res Lett 39
Willett CS, Leben RR, Lavin MF (2006) Eddies and tropical instability waves in the eastern tropical Pacific: a review. Prog Oceanogr 69(2/4):218–238
Wu A, Hsieh WW, Tang B (2006) Neural network forecasts of the tropical Pacific sea surface temperatures. Neural Netw 19(2):145–154
Xie SP (2004) Satellite observations of cool ocean-atmosphere interaction. Bull Am Meteorol Soc 85(2):195–208
Yang Y, Dong J, Sun X, Lima E, Mu Q, Wang X (2018) A CFCC-LSTM model for sea surface temperature prediction. IEEE Geosci Remote Sens Lett 15(2):207–211. https://doi.org/10.1109/LGRS.2017.2780843
Yoder JA, Ackleson SG, Barber RT, Flament P, Balch WM (1994) A line in the sea. Nature 371(6499):689–692
Zhang Q, Wang H, Dong J, Zhong G, Sun X (2017) Prediction of sea surface temperature using long short-term memory. IEEE Geosci Remote Sens Lett 14(10):1745–1749
Zhang R (2014) Effects of tropical instability wave (TIW)-induced surface wind feedback in the tropical Pacific Ocean. Clim Dyn 42(1–2):467–485
Zhang R (2016) A modulating effect of tropical instability wave (TIW)-induced surface wind feedback in a hybrid coupled model of the tropical Pacific. J Geophys Res: Ocean 121(10)
Zhang R, Busalacchi AJ (2008) Rectified effects of tropical instability wave (TIW)-induced atmospheric wind feedback in the tropical Pacific. Geophys Res Lett 35(5):94–96

Author information

Authors and Affiliations

State Key Laboratory of Satellite Ocean Environment Dynamics, Second Institute of Oceanography, Ministry of Natural Resources, Hangzhou, 310012, China
Gang Zheng & Bin Liu
CAS Key Laboratory of Ocean Circulation and Waves, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China
Xiaofeng Li & Ronghua Zhang
Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266237, China
Ronghua Zhang
College of Marine Sciences, Shanghai Ocean University, Shanghai, 201306, China
Bin Liu
Key Laboratory of Marine Ecological Monitoring and Restoration Technologies, Ministry of Natural Resources, Shanghai, 200137, China
Bin Liu

Corresponding author

Correspondence to Xiaofeng Li.

Editor information

Editors and Affiliations

CAS Key Laboratory of Ocean Circulation and Waves, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
Xiaofeng Li
CAS Key Laboratory of Ocean Circulation and Waves, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong, China
Fan Wang

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits any noncommercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if you modified the licensed material. You do not have permission under this license to share adapted material derived from this chapter or parts of it.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Zheng, G., Li, X., Zhang, R., Liu, B. (2023). Forecasting Tropical Instability Waves Based on Artificial Intelligence. In: Li, X., Wang, F. (eds) Artificial Intelligence Oceanography. Springer, Singapore. https://doi.org/10.1007/978-981-19-6375-9_2

DOI: https://doi.org/10.1007/978-981-19-6375-9_2
Published: 04 February 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6374-2
Online ISBN: 978-981-19-6375-9
eBook Packages: Earth and Environmental Science, Earth and Environmental Science (R0)