1 Introduction

In the present scenario, the prediction of non-linear time series is highly important. The ARIMA model is well suited to modelling stationary, linear time series [1,2,3], but it is unsuitable for non-linear series because it relies only on linear dependence among the variables [4]. In [5], Badr et al. obtained forecasts of internet web traffic using Holt’s linear trend, BATS and TBATS models. Numerous non-linear time series prediction models have been developed. Models deploying artificial intelligence techniques such as Genetic Algorithms [6, 7] and Artificial Neural Networks [8,9,10] can be found in the literature. Mahmoudzadeh predicted carbon monoxide emissions by deploying an Imperialistic Competitive ANN [11]. Models using ANFIS also appear, for example in [12] for the emission of carbon monoxide, in [13] for predicting the heat-transfer coefficient in purified water pool boiling, and in [14] for predicting the daily concentration of atmospheric carbon monoxide. A NARX model was suggested by Rosadi et al. to compare learning algorithms for seasonal time series prediction [15].

To forecast non-linear and non-stationary data, prediction models incorporating Empirical Mode Decomposition (EMD), introduced by Huang et al. [19], can also be found in the literature [16,17,18]. Models deploying Ensemble Empirical Mode Decomposition (EEMD), introduced by Wu and Huang [20], mitigate the dominant pitfall of EMD, namely mode mixing. Several researchers have constructed models assimilating EEMD, such as Bao et al. [21], Xie et al. [22] and Jiang et al. [23]. P Sameer and M C Lineesh suggested a hybrid model combining EEMD, SVD and LSTM to forecast atmospheric CO over the Indian region [24] and extended the work by deploying CEEMDAN instead of EEMD [25]. They also developed a hybrid model combining EEMD, SVD and Moving Average for the same prediction task [26]. CEEMDAN, developed by Torres et al. [27], reduces mode mixing and computational cost relative to EMD and EEMD, but it has drawbacks of its own: residual noise appears in the modes and spurious modes can arise. ICEEMDAN, developed by Colominas et al. [28], can effectively reduce these drawbacks. Like EMD, EEMD and CEEMDAN, ICEEMDAN aims to break the data into several Intrinsic Mode Functions (IMFs). An IMF is a function satisfying two conditions: (i) the number of extrema and the number of zero crossings differ by at most one, and (ii) the mean of the upper and lower envelopes determined by the local extrema is zero everywhere [19].

Neural networks are artificial networks that mimic the activity of the human brain through an input-output processing mechanism. Haykin [29] provides an extensive review of neural networks. By choosing appropriate weights and activation functions, neural networks can be utilized to predict time series; one of the earlier applications of neural networks to non-linear time series prediction is by Eric A. Wan [30]. Computational Intelligence (CI) techniques are more flexible than traditional statistical models such as ARIMA and make few or no prior assumptions about the input variables. Additionally, according to [31], CI techniques are better equipped to handle outliers, missing data and noisy data. Hence, CI techniques are frequently employed to capture complex, non-linear relationships in high-dimensional settings. As the most typical of the CI methods, Artificial Neural Networks (ANNs) play a significant role in the analysis and forecasting of time series, as in [32,33,34]. There are various deep learning architectures, including the recurrent neural network (RNN) [35, 36] and the convolutional neural network (CNN) [37, 38], that exploit different features of the input data. The literature also contains other specialized networks, such as a dynamic neural network for non-convex portfolio optimization [39], a non-linear neural circuit for Bluetooth-aided mobile phone localization [40] and distributed recurrent neural networks for cooperative control of manipulators [41]. In general, CNNs are not well suited to capturing the temporal information present in input data, so RNNs have emerged as the dominant choice in research fields involving sequential data. However, RNNs are unable to connect the pertinent information when there is a significant gap between the relevant inputs. Hochreiter and Schmidhuber [42] introduced Long Short-Term Memory (LSTM) to manage such long-term dependencies. In contrast to conventional RNNs, the LSTM network can effectively learn time series data spanning long durations and automatically identify the most suitable time intervals for making predictions. The majority of impressive deep learning outcomes involving RNNs have been achieved using LSTM, which has consequently become a central point of interest within the field. Over the last decade, the LSTM network has been applied successfully in various domains such as speech recognition [43, 44], sentence embedding [45], acoustic modeling [46] and trajectory prediction [47].

A new time series prediction model amalgamating ICEEMDAN, SVD and LSTM is introduced in this study. ICEEMDAN decomposes the data into several IMF components together with a residue, and the components are de-noised using SVD after being converted into Hankel matrices. A Hankel matrix is one whose entries are constant along each skew-diagonal [48]. A Hankel matrix of order \(l \times k\) can be expressed as

$$\begin{aligned} P=\begin{pmatrix} q_1 &{} q_2 &{} q_3 &{} \cdots &{} q_k\\ q_2 &{} q_3 &{} q_4 &{} \cdots &{} q_{k+1}\\ q_3 &{} q_4 &{} q_5 &{} \cdots &{} q_{k+2}\\ \vdots &{} \vdots &{} \vdots &{} &{} \vdots \\ q_l &{} q_{l+1} &{} q_{l+2} &{} \cdots &{} q_n \end{pmatrix}, \end{aligned}$$

where \(l+k-1=n\). The LSTM network is deployed for prediction.
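As an illustration of this Hankel embedding, a matrix of the above form can be constructed directly with MATLAB's built-in hankel function; the following is a minimal sketch in which the series q and window length l are toy choices.

```matlab
% Minimal sketch: embed a series q of length n into an l-by-k Hankel
% matrix with l + k - 1 = n, using MATLAB's built-in hankel.
q = (1:10)';                   % toy series, n = 10
l = 4;                         % number of rows (window length)
k = numel(q) - l + 1;          % number of columns, so that l + k - 1 = n
P = hankel(q(1:l), q(l:end));  % P(i,j) = q(i+j-1)
```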

2 Methodology

2.1 Singular Value Decomposition (SVD)

A \(p\times q\) matrix Q with real entries can be decomposed by SVD as:

$$\begin{aligned} Q = L\,D\,R^{T} \end{aligned}$$

such that \(L_{p\times r}\) and \(R_{q\times r}\) have orthonormal columns. The left and right singular vectors are given by the columns of L and R respectively [48]. The diagonal elements of the diagonal matrix

$$\begin{aligned} D=\begin{pmatrix} S &{} 0\\ 0 &{} 0 \end{pmatrix} \end{aligned}$$

are called the singular values, where \(S = diag(\sigma _1, \sigma _2, \ldots )\) with \(\sigma _1 \ge \sigma _2 \ge \cdots > 0\).

The eigenvectors of \(Q^*Q\) and \(QQ^*\) are known as right and left singular vectors of Q respectively.
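In MATLAB, this decomposition is available through the built-in svd function. The following sketch uses a toy matrix and the economy-size form, in which L is \(p\times r\), D is \(r\times r\) and R is \(q\times r\) with \(r=\min (p,q)\).

```matlab
% Sketch of SVD in MATLAB on a toy matrix; svd returns the singular
% values on the diagonal of D in non-increasing order.
Q = randn(6, 4);             % toy p-by-q real matrix
[L, D, R] = svd(Q, 'econ');  % Q = L*D*R'
sigma = diag(D);             % the singular values
recErr = norm(Q - L*D*R');   % reconstruction error, ~ machine precision
```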

2.2 Empirical Mode Decomposition (EMD)

Huang et al. suggested EMD in 1998 [19]. Non-stationary and non-linear time series data can be analysed by EMD. Its aim is to fragment the series into several IMF components together with a residue.

The steps of EMD are as follows:

a) Identify all the local extrema of the series \(x_{t}\).

b) Construct the lower envelope \(l_{m}\) by connecting all the local minima and the upper envelope \(l_{M}\) by connecting all the local maxima (e.g. by cubic spline interpolation).

c) Determine the mean \(a_{1_t}\) of \(l_{m}\) and \(l_{M}\), i.e.

$$\begin{aligned} a_{1_t}=\frac{l_{m}+l_{M}}{2} \end{aligned}$$

d) Determine \(i_{1_t}=x_t-a_{1_t}\).

e) If \(i_{1_t}\) satisfies the conditions for an IMF, assign \(i_{1_t}\) to the first IMF \(c_{1_t}\) and replace the actual series by the residue \(r_{1_t}=x_t-i_{1_t}\). Otherwise, \(i_{1_t}\) replaces the actual series \(x_t\).

f) Repeat steps (a) to (e).

The following can be used as stopping criteria for this process:

(i) the residue is a monotone function, so that no further IMF component can be extracted;

(ii) the numbers of extrema and of zero crossings are equal in two successive sifting steps.

By the above process, the original series \(x_t\) can be expressed as:

$$\begin{aligned} x_t=\sum _{j=1}^{n} c_{j_t} + r_{n_t}, \end{aligned}$$

where \(c_{j_t}\) represents the IMF components and \(r_{n_t}\) is the final residue which is a constant or a trend.
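One sifting step (steps (a) to (d) above) can be sketched in MATLAB as follows. This is a minimal sketch under stated assumptions: it uses findpeaks from the Signal Processing Toolbox to locate interior extrema and cubic splines for the envelopes, and it presumes the series has enough extrema for the interpolation to be meaningful.

```matlab
% Sketch of one sifting step of EMD (steps (a)-(d) above); assumes
% the Signal Processing Toolbox for findpeaks and enough interior
% extrema for spline interpolation.
function [iCand, aMean] = siftOnce(x)
x = x(:);  t = (1:numel(x))';
[~, iMax] = findpeaks(x);       % (a) locations of local maxima
[~, iMin] = findpeaks(-x);      % (a) locations of local minima
lM = spline(iMax, x(iMax), t);  % (b) upper envelope through the maxima
lm = spline(iMin, x(iMin), t);  % (b) lower envelope through the minima
aMean = (lm + lM) / 2;          % (c) envelope mean a_t
iCand = x - aMean;              % (d) IMF candidate i_t
end
```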

2.3 Ensemble Empirical Mode Decomposition (EEMD)

EEMD, developed by Wu and Huang in 2009 [20], is a noise-assisted extension of EMD. The problem of mode mixing, the main pitfall of EMD, can be reduced by deploying EEMD.

The steps of EEMD are as follows:

a) Construct new data sets \(x^j_n\) by adding several white noise realizations \(w^j_n\ (j=1, 2, \ldots , N)\) to the original data \(x_n\), i.e.

$$\begin{aligned} x^j_n=x_n+w^j_n \end{aligned}$$

b) Apply EMD to every \(x^j_n\ (j=1, 2, \ldots , N)\) to break it into components \(IMF^j_k\), where \(k=1, 2, \ldots , K\).

c) By averaging \(IMF^j_k\) over \(j=1, 2, \ldots , N\), deduce the k-th mode \({\overline{IMF}}_k\).
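The ensemble loop translates directly into MATLAB. The sketch below uses the built-in emd function of the Signal Processing Toolbox; the ensemble size N, mode count K and noise level eps0 are illustrative choices, not values from this study.

```matlab
% Sketch of EEMD using MATLAB's built-in emd; N, K and eps0 are
% illustrative choices.
x = x(:);  N = 100;  K = 7;  eps0 = 0.2*std(x);
imfSum = zeros(numel(x), K);
for j = 1:N
    xj = x + eps0*randn(size(x));    % (a) x^j_n = x_n + w^j_n
    imf = emd(xj, 'MaxNumIMF', K);   % (b) EMD of the j-th realization
    m = size(imf, 2);                % emd may return fewer than K modes
    imfSum(:, 1:m) = imfSum(:, 1:m) + imf;
end
imfBar = imfSum / N;                 % (c) ensemble means, the modes IMF_k
```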

2.4 Complete Ensemble Empirical Mode Decomposition With Adaptive Noise (CEEMDAN)

CEEMDAN, an extension of EEMD, was designed by Torres et al. in 2011 [27]. CEEMDAN is more efficient than EMD and EEMD at eliminating mode mixing and reducing computational cost, and it requires fewer sifting iterations than EEMD. In CEEMDAN, adaptive white noise is added at each stage of the decomposition and only the residual signal is computed at each step.

The steps of CEEMDAN are designed as follows:

For the series \(x_n\), denote the r-th EMD mode of \(x_n\) by \(E_r(x_n)\), and let \(w^j\) be realizations of white noise.

a) Use EMD to decompose N realizations \(x_n+\epsilon _0 w^j_n,\ j=1, \ldots , N,\) up to their first modes. Find out

$$\begin{aligned} {\widetilde{IMF}}_1=\frac{1}{N}\sum _{j=1}^{N}IMF_1^j={\overline{IMF}}_1 \end{aligned}$$

b) Find out the first residue as

$$\begin{aligned} r_1=x_n-{\widetilde{IMF}}_1 \end{aligned}$$

c) Apply EMD to the realizations \(r_1+\epsilon _1E_1( w^j_n),\ j=1, \ldots , N\) up to their first modes. Find out the second mode as

$$\begin{aligned} {\widetilde{IMF}}_2=\frac{1}{N}\sum _{j=1}^{N}E_1(r_1+\epsilon _1E_1( w^j_n)) \end{aligned}$$

d) Using the equation

$$\begin{aligned} r_m=r_{m-1}-{\widetilde{IMF}}_m, \end{aligned}$$

calculate the m-th residue.

e) Use EMD to decompose the realizations \(r_m+\epsilon _m E_m( w^j_n),\ j=1,\ldots ,N\) and obtain their first modes. Find out the (\(m+1\))-th mode as

$$\begin{aligned} {\widetilde{IMF}}_{m+1}=\frac{1}{N}\sum _{j=1}^{N}E_1(r_m+\epsilon _mE_m( w^j_n)) \end{aligned}$$

f) Go to step (d) for the next m.

Steps (d) to (f) are executed until the residue can no longer be decomposed. The final residue is therefore:

$$\begin{aligned} R~=~x_n-\sum _{m=1}^{M}{\widetilde{IMF}}_m. \end{aligned}$$

So, \(x_n\) can be expressed as:

$$\begin{aligned} x_n=\sum _{m=1}^{M}{\widetilde{IMF}}_m+R. \end{aligned}$$
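A single pass of steps (d) and (e) can be sketched as below. Here emdMode is a hypothetical helper that returns the m-th EMD mode of a signal, built on MATLAB's emd; r, w, epsm and N denote the current residue, the noise realizations (as columns), the noise amplitude and the ensemble size.

```matlab
% One pass of CEEMDAN steps (d)-(e); emdMode is a hypothetical helper.
acc = zeros(size(r));
for j = 1:N
    acc = acc + emdMode(r + epsm*emdMode(w(:, j), m), 1);
end
imfNext = acc / N;      % the (m+1)-th mode
rNext   = r - imfNext;  % the residue for the next iteration

% Hypothetical helper (local function or separate file): the m-th EMD
% mode of a signal s, via MATLAB's emd.
function imfm = emdMode(s, m)
modes = emd(s, 'MaxNumIMF', m);
imfm = modes(:, min(m, size(modes, 2)));
end
```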

2.5 Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN)

ICEEMDAN, developed by Colominas et al. [28] in 2014, can deal with non-linear and non-stationary data. The drawbacks of CEEMDAN are the appearance of residual noise in the components and the existence of spurious components; ICEEMDAN can effectively reduce both drawbacks.

The steps of ICEEMDAN are as follows:

Let the data be x. The operator \(E_j(x)\) is defined as the j-th EMD mode of x and M(x) as the local mean of the signal x. Let \(w^i\) be white noise and \(\left\langle \ \cdot \ \right\rangle \) the operation of averaging over all the realizations.

a) Use EMD to create I realizations \(x^{i}=x+\beta _{0}E_1(w^{i})\) and compute their local means to obtain the first residue

$$\begin{aligned} r_1=\left\langle M(x^{i})\right\rangle \end{aligned}$$

b) The first mode can be calculated as:

$$\begin{aligned} \widetilde{d_1}=x-r_1 \end{aligned}$$

c) Obtain the second residue by averaging the local means of the realizations \(r_1+\beta _{1}E_2(w^{i})\); the second mode is then the difference between the first and second residues:

$$\begin{aligned} r_2&=\left\langle M(r_1+\beta _{1}E_2(w^{i})) \right\rangle \\ \widetilde{d_2}&=r_1-r_2=r_1-\left\langle M(r_1+\beta _{1}E_2(w^{i})) \right\rangle \end{aligned}$$

d) For \(k=3, \ldots , K\) calculate the k-th residue

$$\begin{aligned} r_k=\left\langle M(r_{k-1}+\beta _{k-1}E_k(w^{i})) \right\rangle \end{aligned}$$

e) The k-th mode can be obtained by

$$\begin{aligned} \widetilde{d_k}=r_{k-1}-r_k. \end{aligned}$$

f) Go to step (d) for the next k.
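Steps (a) and (b) can be sketched as follows. The helpers firstMode and localMean are hypothetical: firstMode returns \(E_1(\cdot )\), e.g. via emd(s,'MaxNumIMF',1), and localMean returns \(M(\cdot )\), e.g. the envelope mean aMean computed by siftOnce in Sect. 2.2.

```matlab
% Sketch of ICEEMDAN steps (a)-(b); firstMode and localMean are
% hypothetical helpers for E_1(.) and the local mean M(.).
acc = zeros(size(x));
for i = 1:I
    xi  = x + beta0*firstMode(w(:, i));  % x^i = x + beta_0 E_1(w^i)
    acc = acc + localMean(xi);           % accumulate the local means M(x^i)
end
r1 = acc / I;                            % (a) first residue <M(x^i)>
d1 = x - r1;                             % (b) first mode
```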

2.6 SVD-based Time Series De-noising

Clean and noise parts of a time series can be separated by deploying SVD [49]. Let \(x_n\) be a signal such that

$$\begin{aligned} x_n = x_{n_s} + w_{n_w}, \end{aligned}$$

where \(x_{n_s}\) and \(w_{n_w}\) are the clean part and the white noise part of \(x_n\) respectively. In Hankel matrix representation, \(x_n\) can be expressed as:

$$\begin{aligned} Q= \begin{pmatrix} x_1 &{} x_2 &{} \cdots &{} x_k\\ x_2 &{} x_3 &{} \cdots &{} x_{k+1}\\ \vdots &{} \vdots &{} &{} \vdots \\ x_l &{} x_{l+1} &{} \cdots &{} x_n \end{pmatrix}. \end{aligned}$$

As \(x_n = x_{n_s} + w_{n_w}\), the Hankel representation has the form

$$\begin{aligned} Q = Q_s + Q_{w}, \end{aligned}$$

where \(Q,\ Q_s\) and \(Q_{w}\) are the Hankel representations of the actual, clean and white noise signals respectively. The matrix Q is decomposed by SVD as:

$$\begin{aligned} Q = L\,D\,R^{T}, \end{aligned}$$

where L and R are orthogonal matrices and D is a diagonal matrix with diagonal elements as singular values.

The SVD separates the data matrix into clean and noise parts. Since the singular vectors span the data space, the noise part corresponds to the singular values near zero.

The clean and noise subspace separation can be represented as:

$$\begin{aligned} Q = L\,D\,R^{T}=\begin{pmatrix} L_1&L_2 \end{pmatrix} \begin{pmatrix} D_1 &{} 0\\ 0 &{} D_2 \end{pmatrix} \begin{pmatrix} R_1^{T}\\ R_2^{T} \end{pmatrix} \end{aligned}$$

then

$$\begin{aligned} Q = L_1\,D_1\,R_1^{T} + L_2\,D_2\,R_2^{T}, \end{aligned}$$

where the singular values corresponding to clean and noise subspaces are respectively the diagonal entries of \(D_1\) and \(D_2\). Hence we have

$$\begin{aligned} Q_s = L_1\,D_1\,R_1^{T} \end{aligned}$$

and

$$\begin{aligned} Q_{w} = L_2\,D_2\,R_2^{T}. \end{aligned}$$

We need to determine a threshold on the singular values such that those lower than the threshold correspond to the noise subspace, and set them to zero. By plotting the singular values against their index, the threshold can be identified as the point at which the slope changes drastically.

The combined application of Hankelization and SVD thus establishes an effective denoising methodology: the two steps together enhance the ability to discern meaningful information from noisy components in complex time series data.
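The whole procedure, Hankel embedding, truncation of the small singular values and recovery of a series by averaging each skew-diagonal, can be sketched in MATLAB as follows; the window length l and threshold tau are user inputs, with tau read off the singular-value plot as described above.

```matlab
% Sketch of Hankel-SVD denoising: embed, zero the singular values
% below tau, reconstruct, and average each skew-diagonal back to a
% series.
function xd = svdDenoise(x, l, tau)
x = x(:);  n = numel(x);  k = n - l + 1;
Q = hankel(x(1:l), x(l:end));  % Hankel embedding of the series
[L, D, R] = svd(Q, 'econ');
s = diag(D);
s(s < tau) = 0;                % suppress the noise subspace
Qs = L*diag(s)*R';             % estimate of the clean part Q_s
xd = zeros(n, 1);              % average Qs along each skew-diagonal
for d = 1:n
    i = max(1, d-k+1):min(l, d);                 % rows on skew-diagonal d
    xd(d) = mean(Qs(sub2ind([l k], i, d-i+1)));  % columns are j = d-i+1
end
end
```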

2.7 Long Short Term Memory (LSTM) network

The LSTM network, with an architecture designed to handle time-dependent inputs and targets, is a powerful deep learning tool. Its ability to analyze and predict time series data stems from its capacity to resolve long-term dependency issues. At the heart of the LSTM network lies the memory cell, which enables the network to capture, retain and utilize temporal information and thereby to model intricate temporal patterns; this is what gives the LSTM its superior performance on dynamic, time-dependent datasets. The foundational principles and critical components of an LSTM network are detailed in [50].

2.8 ICEEMDAN - SVD - LSTM Prediction Model

This study combines ICEEMDAN, SVD and the LSTM network to propose a new time series prediction model. The proposed model has three levels: an ICEEMDAN level, an SVD level and an LSTM level. In level 1, ICEEMDAN produces a number of IMF components and a residue. In level 2, SVD is applied to the Hankel representations of the IMF components and the residue to de-noise them. In level 3, a forecast of each de-noised component is constructed using the LSTM network, providing insight into the potential future trajectory of the time series based on the denoised components. The forecast of the original data is obtained by adding all the component forecasts. Figure 1 illustrates the design of the ICEEMDAN - SVD - LSTM model. This multi-level approach, ICEEMDAN - SVD denoising followed by LSTM forecasting, constitutes a comprehensive framework for analyzing and predicting complex time series data. The curve plotting and programming are done in Matlab 9.10.0.
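Under stated assumptions, the three levels can be chained as in the sketch below: iceemdan denotes the reference implementation of Colominas et al. (with arguments noise standard deviation, number of realizations, maximum sifting iterations and SNRFlag), svdDenoise is the routine sketched in Sect. 2.6, and lstmForecast is a user-written wrapper around the network described in Sect. 3.4; horizon, l and the thresholds tau(m) are user inputs.

```matlab
% End-to-end sketch of the proposed model (hedged: iceemdan,
% svdDenoise and lstmForecast as described in the text above).
modes = iceemdan(x, 0.2, 500, 10, 1);  % level 1: IMFs + residue (rows)
yhat = zeros(1, horizon);
for m = 1:size(modes, 1)
    clean = svdDenoise(modes(m, :), l, tau(m));   % level 2: Hankel-SVD
    yhat  = yhat + lstmForecast(clean, horizon);  % level 3: per-mode LSTM
end
% yhat: forecast of the original series (sum of the mode forecasts)
```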

Fig. 1

Step-by-step diagram of ICEEMDAN - SVD - LSTM Prediction Model

3 Application of Methodology

3.1 Data

Wolf’s Sunspot Numbers represent a historical record of solar activity based on the observation of sunspots. While the specific index has evolved over time, it remains a crucial dataset for understanding long-term solar behaviour and its potential impact on Earth. To evaluate the proposed prediction model, Wolf’s Sunspot Numbers from the year 1700 to 1988 are used. The data is non-linear and of size 289. Figure 2 gives the time series plot of the data.

Fig. 2

Sunspot data (from the year 1700 to 1988)

3.2 Segregation of IMF Components

In the first level, ICEEMDAN is used to decompose the given data into several IMF components. Here, ICEEMDAN is started with 500 realizations, each created by adding to the original data series the first EMD mode of a white noise realization scaled by the noise standard deviation. The maximum number of sifting iterations is set to 10, and SNRFlag is set to 1, i.e., the SNR (Signal-to-Noise Ratio) increases at every stage. As a result, ICEEMDAN produces seven IMF components, namely IMF1, IMF2, ..., IMF7, and a residue. This decomposition allows us to extract and analyze the underlying oscillatory modes and patterns within the time series data. The components IMF1 through IMF7 capture different frequency bands, with IMF1 representing the highest frequency and IMF7 the lowest. In Fig. 3, these components are presented in their order of extraction, giving a clear depiction of the signal’s frequency content from higher to lower frequencies. Notably, the final component of the decomposition represents the overall trend of the time series, encapsulating the long-term behaviour of the original data.

Fig. 3

The Extricated IMF components of Sunspot data by ICEEMDAN

3.3 De-noising the IMF Components by SVD

To enhance the denoising in level 2, each of the IMF components and the residue obtained from ICEEMDAN in level 1 is individually transformed into a Hankel matrix. This transformation is a pivotal preprocessing step, organizing the temporal structure of each component into a coherent matrix representation for the subsequent application of SVD, while preserving the inherent patterns and characteristics of each series. Subsequently, SVD is applied to all the Hankel matrices, breaking each one down into singular values and the corresponding left and right singular vectors. The key advantage lies in the ability to distinguish the dominant patterns in the data from the noise components that may be present.

The crux of the denoising process lies in the judicious selection of singular values. By retaining only the part of the Hankel representation corresponding to the above-threshold singular values, we keep the essential patterns while effectively filtering out the noise. The series recovered from this retained part is the noise-reduced version of the data, preserving the significant information. This refined dataset serves as a foundation for more accurate and reliable analyses: it improves the interpretability of the underlying patterns and ensures that subsequent modeling and forecasting are based on a cleaner representation of the original time series.

3.4 Forecasting by LSTM

In level 3, the LSTM network is used to predict future values of each de-noised series. Leveraging the capabilities of LSTM, known for its proficiency in capturing temporal dependencies, allows us to model and predict the evolving patterns within the denoised time series. To facilitate this predictive modeling, the denoised series generated through the ICEEMDAN-SVD process serve as input sequences for the LSTM network. By training the LSTM on the first 90 percent of each denoised series and evaluating its performance on the remaining 10 percent, we ensure a robust assessment of its predictive accuracy. The training and testing split of 90-10 allows the LSTM network to learn from the majority of the historical data while providing a rigorous evaluation of unseen data, thus gauging its generalization capability.

The architecture of the LSTM network utilized in this study consists of several key components tailored to capture temporal dependencies and patterns within the input data. The input layer accepts input sequences with a specified number of features. In our case, as the input data is univariate time series data representing sunspot numbers, the input layer is configured with a single feature. In our implementation of the LSTM network, we employed a single LSTM layer with 500 hidden units. The choice of the number of hidden units is based on experimentation and aims to balance model complexity and predictive performance. We utilized the tanh activation function for the LSTM layer. Following the LSTM layer, a Fully Connected Layer is incorporated to map the LSTM outputs to a single output value. This layer facilitates the transformation of LSTM features into the desired output format for regression tasks. The regression layer serves as the output layer of the LSTM network, responsible for producing continuous-valued predictions. It computes the loss between the predicted values and the ground truth values during training, facilitating the optimization of the network parameters.

The LSTM network is trained using the Adam optimizer, a variant of stochastic gradient descent (SGD) known for its efficiency and effectiveness in training deep neural networks. The training process is iterated over a maximum of 250 epochs, controlling the duration of the training procedure. The initial learning rate is set to 0.005. After every 125 epochs, the learning rate is decayed by a factor of 0.2, facilitating smoother convergence and preventing overfitting.
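The architecture and training options described above translate directly into Deep Learning Toolbox code. The following is a sketch, assuming standardized univariate training sequences XTrain and YTrain.

```matlab
% Sketch of the network and training options described above
% (Deep Learning Toolbox).
layers = [ ...
    sequenceInputLayer(1)            % one feature: the denoised series
    lstmLayer(500)                   % 500 hidden units, tanh activation
    fullyConnectedLayer(1)           % map LSTM features to one output
    regressionLayer];                % loss for continuous-valued targets
options = trainingOptions('adam', ...
    'MaxEpochs', 250, ...
    'InitialLearnRate', 0.005, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropPeriod', 125, ...
    'LearnRateDropFactor', 0.2, ...
    'Verbose', 0);
net = trainNetwork(XTrain, YTrain, layers, options);  % 90% training split
```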

The predictions generated by the LSTM network for each denoised series are then aggregated to produce an overall forecast of the actual data. This aggregation process consolidates the individual predictions from each ICEEMDAN-SVD component, resulting in a comprehensive prediction of the future values for the entire time series. In Fig. 4, the ICEEMDAN-SVD modes and their corresponding LSTM predictions are visually depicted, offering a clear illustration of the model’s forecasting performance. This visual representation enables a qualitative assessment of how well the LSTM captures the intricate temporal dynamics present in each denoised series, showcasing its effectiveness in predicting future values.

Fig. 4

Actual and predicted ICEEMDAN-SVD modes by LSTM

3.5 Comparison

The seven other models this study utilized to measure the efficacy of the recommended model are the LSTM, EMD - LSTM, EEMD - LSTM, CEEMDAN - LSTM, EEMD - SVD - LSTM, ICEEMDAN - LSTM and CEEMDAN - SVD - LSTM models. The LSTM model forecasts the data by applying LSTM directly. In the EMD - LSTM model, EMD produces five IMFs and a residue, each of which is predicted by LSTM, and the predictions are added to obtain the forecast of the actual series. In the EEMD - LSTM model, EEMD first fragments the data into eight IMF components and a residue; the forecast of the actual data is then obtained by summing the LSTM predictions of the components and the residue. In the CEEMDAN - LSTM model, each of the eight IMF components and the residue obtained by CEEMDAN is predicted by LSTM, and their sum gives the forecast of the original series. In the EEMD - SVD - LSTM model, the eight IMF components and the residue obtained by EEMD are de-noised by SVD before being forecasted by LSTM, and the component forecasts are added to obtain the forecast of the original series. In the ICEEMDAN - LSTM model, LSTM is applied to each of the seven IMF components and the residue obtained by ICEEMDAN and, as before, the component forecasts are added to obtain the forecast of the actual series. In the CEEMDAN - SVD - LSTM model, CEEMDAN first decomposes the data into eight IMF components and a residue, SVD then de-noises each component and the residue, LSTM forecasts each of the CEEMDAN - SVD modes, and adding all the forecast series gives the forecast of the original series. The modes produced by EMD, EEMD, CEEMDAN, EEMD - SVD, ICEEMDAN and CEEMDAN - SVD and their LSTM forecasts are depicted in Figs. 5, 6, 7, 8, 9 and 10.

Fig. 5

Actual and predicted EMD modes by LSTM

Fig. 6

Actual and predicted EEMD modes by LSTM

Fig. 7

Actual and predicted CEEMDAN modes by LSTM

Fig. 8

Actual and predicted EEMD - SVD modes by LSTM

Fig. 9

Actual and predicted ICEEMDAN modes by LSTM

Fig. 10

Actual and predicted CEEMDAN - SVD modes by LSTM

3.6 Performance Measures

Many performance measures are used in the literature to validate the efficiency of forecasting models. Here we utilize the Root Mean Squared Error (RMSE) and the Mean Absolute Error (MAE), defined respectively as:

$$\begin{aligned} RMSE&=\sqrt{\frac{1}{m}\sum _{i=1}^{m}(y_i-\widehat{y_i})^2}\\ MAE&=\frac{1}{m}\sum _{i=1}^{m}|y_i-\widehat{y_i}| \end{aligned}$$
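Both measures translate directly into MATLAB, with y the observed and yhat the predicted values, each of length m.

```matlab
% RMSE and MAE for observed y and predicted yhat (vectors of length m).
rmse = sqrt(mean((y - yhat).^2));
mae  = mean(abs(y - yhat));
```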

4 Results and Discussion

In this study, a novel approach for predicting non-linear time series is introduced through the development of a hybrid ICEEMDAN - SVD - LSTM model. To evaluate its efficacy, the model is applied to Wolf’s Sunspot Numbers, a canonical dataset renowned for its complexity and non-linear characteristics. ICEEMDAN is run with 500 realizations, a maximum of 10 sifting iterations and a Signal-to-Noise Ratio that increases at every stage, decomposing the data into seven IMF components and a residue. In the application of SVD to the Hankel representation of each IMF component to identify the noise subspace, the singular values not greater than 2.19861 are considered zero in the case of the Hankel matrix corresponding to IMF1. The corresponding thresholds for the Hankel matrices of IMF2, IMF3, ..., IMF7 and the residue are respectively 27.7471, 9.77947, 3.90104, 4.04227, 6.2429, 1.38051 and 56.4937. To find these values, we plot the singular values of each Hankel matrix against their index and take as the threshold the point at which the slope changes drastically. In each case, the part of the Hankel representation corresponding to the non-zero singular values is used as the denoised part of the corresponding series. These denoised parts are then forecasted by the LSTM network as described in the previous section, and aggregating the forecasted series yields the forecast of the actual series.

In addition to our proposed model, we benchmarked its performance against seven established techniques: LSTM, EMD - LSTM, EEMD - LSTM, CEEMDAN - LSTM, EEMD - SVD - LSTM, ICEEMDAN - LSTM, and CEEMDAN - SVD - LSTM. The results of this comparative analysis are summarized in Table 1, showcasing the efficiency and effectiveness of our hybrid model. Furthermore, Fig. 11 presents a visual depiction of the observed and predicted Sunspot data, along with the corresponding error analyses conducted using each of the evaluated models. This comprehensive assessment provides valuable insights into the predictive capabilities of our proposed approach and its superiority in accurately capturing the dynamics of complex non-linear time series data.

Table 1 Comparison of the proposed model with other models
Fig. 11

Observed versus Predicted Sunspot data and the Errors by LSTM, EMD - LSTM, EEMD - LSTM, CEEMDAN - LSTM, EEMD - SVD - LSTM, ICEEMDAN - LSTM, CEEMDAN - SVD - LSTM and ICEEMDAN - SVD - LSTM models

5 Conclusion

Time series forecasting holds a pivotal role across diverse domains in the modern world, influencing decision-making processes and strategic planning in various sectors. The forecasting of non-linear time series is particularly crucial in the contemporary era, offering the potential to provide nuanced insights that can profoundly impact the future landscape of multiple industries. In response to the intricate challenges posed by non-linear time series forecasting, this study introduces a novel hybrid model: ICEEMDAN - SVD - LSTM. Comprising ICEEMDAN, SVD, and LSTM Network, this model exhibits a synergistic blend of established techniques to address the complexities inherent in non-linear time series data. The proposed hybrid model undergoes a rigorous comparative analysis against several existing models, including LSTM, EMD - LSTM, EEMD - LSTM, CEEMDAN - LSTM, EEMD - SVD - LSTM, ICEEMDAN - LSTM and CEEMDAN - SVD - LSTM. Through comprehensive experimentation and evaluation, our findings robustly affirm the superior efficiency of the ICEEMDAN - SVD - LSTM model. It consistently outperforms traditional models, showcasing its capacity to extract meaningful patterns, reduce noise, and accurately forecast future values in non-linear time series data.

This venture not only contributes a state-of-the-art forecasting model but also emphasizes the importance of adopting hybrid approaches that leverage the strengths of multiple techniques. As we navigate an era increasingly characterized by intricate data dynamics, the proposed model stands as a promising tool for advancing the precision and reliability of time series forecasting, thereby empowering decision-makers across diverse industries to make informed and strategic choices for the future.

The hybrid approach we have constructed makes use of ICEEMDAN, which is based on Empirical Mode Decomposition (EMD). EMD’s primary focus is on identifying the underlying oscillatory patterns within a time series, not on detecting structural breaks or abrupt shifts. Structural breaks often involve sudden changes in the statistical properties of the data, such as shifts in the mean, variance or other characteristics, and EMD does not explicitly address changes in these properties. Hence the proposed hybrid approach may not be suitable for the analysis of non-linear time series with structural breaks. Incorporating into the proposed model statistical methods for detecting structural breaks, such as the Chow test, the CUSUM test and the Bai-Perron test, is a possible direction for future work.