Introduction

Sudden cardiac death (SCD) is defined as a death arising from an underlying cardiac disease that occurs, usually, within an hour of the onset of symptoms. SCD is a complication of a number of cardiovascular diseases and is often unexpected [1, 2]. Clinical manifestations may include chest pain, shortness of breath, fatigue, weakness, persistent angina pectoris, and arrhythmia [3]. At present, there are few effective methods for predicting the occurrence of SCD in individuals without prior cardiac issues. As a simple, easy-to-use, and reliable diagnostic tool, the electrocardiogram (ECG) provides abundant information for the diagnosis and treatment of cardiovascular disease. Significant abnormal fluctuations can be detected in patients' ECG signals before the onset of SCD [4, 5]. For example, ventricular fibrillation (VF) is an important manifestation of SCD, and the trend toward VF can be observed by monitoring ECG signals [6, 7]. However, the ECG is a weak signal with strong nonlinearity, non-stationarity, and randomness, which affects the final diagnostic results. Therefore, accurate prediction of ECG signals plays a pivotal role in the early detection and prevention of SCD [8, 9].

At present, traditional machine learning models, such as classification and regression models, are commonly applied to ECG prediction; they forecast on the basis of historical data [10,11,12,13]. For example, Liu et al. [14] developed a cardiac arrest classification model utilizing the wavelet transform and the AdaBoost algorithm. This model effectively distinguishes cardiac arrest from ECG signals and predicts its occurrence with an accuracy of 97.56% within 5 minutes before the event. Ebrahimzadeh et al. [15] employed a Multi-Layer Perceptron (MLP) to classify abnormal ECG signals with the aim of predicting SCD; their model demonstrates increasing prediction accuracy as it approaches the critical point of sudden death. Sengupta et al. [16] utilized Random Forest, Least Squares Discriminant, and Support Vector Machine classifiers on 12-lead ECG signals to predict abnormal myocardial relaxation and assess the likelihood of SCD. Hou et al. [17] presented a deep learning-based algorithm that combines an LSTM-based auto-encoder (LSTM-AE) network with a support vector machine (SVM) for ECG arrhythmia classification; this method exhibits high accuracy, sensitivity, and specificity across various heartbeat types. Kaya et al. [18] proposed an approach that combines the angle transform (AT) and LSTM for the automatic identification of congestive heart failure (CHF) and arrhythmia (ARR) from ECG signals. However, most of these methods perform classification by contrasting patients' ECG signals with those of healthy individuals, and therefore struggle with time-dependent dynamic system modeling problems [19, 20].

With the rapid development of artificial intelligence, neural networks have been broadly applied in signal processing and achieve excellent performance. Jin et al. [21] proposed a regression model based on the Regularized Extreme Learning Machine (RELM) to predict ECG signals by analyzing the correlation between ECG and human gait. Zheng et al. [22] employed a traditional Echo State Network (ESN), a type of recurrent neural network, to forecast ECG signals, yielding superior results compared to conventional regression machine learning techniques. Wang et al. [23] devised a cardiac beat prediction algorithm for SCD based on ECG, establishing a time series prediction model for dynamic human ECG signals to accurately anticipate ECG signal patterns. Sakib et al. [24] employed a Reservoir Computing (RC)-based ESN method for magnetocardiography (MCG) monitoring to help detect cardiac activity.

In recent years, some powerful sequence models have been proposed to assist ECG analysis, with the advantage of exploring time-frequency features [25,26,27,28,29,30]. As one of the most commonly used sequence models, the Long Short-Term Memory (LSTM) network has proven effective at tracking information over extended periods [31, 32]. For instance, Liu et al. [33] employed LSTM to predict influenza trends and achieved better results than linear models. Balci et al. [34] presented a hybrid attention-based LSTM-XGBoost algorithm for detecting atrial fibrillation (AF) in long-recorded ECG data. Combined with preprocessing techniques, this method achieves high accuracy, offering a reliable support system for clinicians and facilitating data tracking in long ECG record reviews.

However, traditional LSTM networks exhibit weak robustness and low prediction accuracy in complex tasks, because their memory cells retain memories unrelated to the current time step [35, 36]. To address this issue, we propose an integrated approach combining data preprocessing and prediction model construction to predict ECG signal trends in advance. First, data preprocessing is performed on the original ECG signals, including wavelet denoising, normalization, and phase space reconstruction. Then, the Nested LSTM model is used for signal prediction. At this step, an inner LSTM unit is adopted, which guides memory forgetting and memory selection for the ECG signals, thereby improving the model's ability to process ECG data and its prediction accuracy.

This paper is organized as follows. The Introduction section briefly reviews existing ECG signal prediction methods. The proposed methods and their algorithms are described in detail in the Theory and calculation section. The Experiment and results section provides implementation details of the experiments, and the effectiveness and superiority of the proposed method are verified through the experimental results and their analysis. The Conclusion section summarizes the paper.

Theory and calculation

Model construction

The model framework proposed in this paper is presented in Fig. 1; it comprises the data preprocessing strategy, the Nested LSTM prediction model, and model evaluation.

Fig. 1

Model construction framework diagram

Preprocessing methods

Considering the uncertainty and complexity of ECG signals, it is difficult to capture the trend of the data directly. We therefore employ a data preprocessing strategy to ensure data quality and remove unwanted noise from the ECG signals. The preprocessing methods include signal denoising, normalization, and phase space reconstruction.

Signal denoising

Due to the movement of human limbs, breathing, and electromagnetic interference from the surrounding environment, ECG signals are accompanied by considerable noise, including baseline drift, power frequency interference, electromyographic (EMG) interference, and motion artifacts, all of which can affect the prediction results. The frequency ranges and signal energy ranges of the four types of noise are as follows: (1) Baseline drift noise: the noise frequency is below 5 Hz; the energy is concentrated between 0.01 Hz and 1 Hz. (2) Power frequency interference noise: the noise frequency is 50 Hz or 60 Hz; the energy is concentrated in frequency components near the power frequency. (3) EMG interference noise: the noise frequency lies in the range of 5 Hz to 2000 Hz; the energy is concentrated between 10 Hz and 500 Hz, depending on the frequency of muscle activity. (4) Motion artifact noise: the noise frequency lies in the range of 3 Hz to 14 Hz, depending on the frequency and amplitude of motion; the energy is concentrated between 5 Hz and 10 Hz, reflecting the frequency components caused by the subject's movement.

Therefore, it is necessary to denoise the ECG signals. Wavelet denoising offers multi-resolution analysis and can characterize the local features of a signal in both the time and frequency domains, making it well suited to nonstationary signals such as the ECG. Accordingly, the wavelet denoising method is used to denoise the ECG signals in this paper. Its process is shown in Fig. 2, and the specific steps are as follows (a minimal code sketch is given after the list):

  • Input the original ECG signals containing noise;

  • Decompose the original ECG signals into 7 layers through the wavelet transform, using the db6 wavelet function;

  • Extract wavelet coefficients of each layer, including approximation coefficients and detail coefficients;

  • Obtain the threshold of each layer by unbiased risk estimation; that is, for each layer, assign a candidate threshold L, calculate its risk estimate, and take the L that minimizes the risk as the layer's threshold. The details of determining the threshold are as follows:

    • Step 1: Square the wavelet coefficients of each resolution level and arrange them in ascending order to obtain the vector \(P=[P_1,P_2,...,P_N]\), where N is the number of wavelet coefficients at that level.

    • Step 2: Calculate the risk vector R based on the vector P, and find the smallest \(R_i\) in the risk vector as the risk value. The formula is as follows:

      $$\begin{aligned} R=[R_1,R_2,...,R_N] \end{aligned}$$
      (1)
      $$\begin{aligned} R_i=\left[ N-2i+(N-i)P_i+\sum \limits _{k = 1}^{i} {P_k} \right] /N \end{aligned}$$
      (2)
    • Step 3: The threshold value L is calculated from \(P_i\), the squared wavelet coefficient corresponding to the risk value \(R_i\):

      $$\begin{aligned} L=\frac{\mathrm{median}(\left| W(j-1,k)\right| )}{0.6745}\sqrt{P_i} \end{aligned}$$
      (3)
  • Denoise the decomposed 7-layer signals according to the selected thresholds; if \({X_i}\) denotes the ECG signal of the i-th layer after denoising and \({d_i}\) denotes the ECG signal of the i-th layer before denoising, the denoising rule for each layer is as follows:

    $$\begin{aligned} {X_i} = \left\{ \begin{array}{ll} {d_i},&{} \left| {d_i}\right| \ge {L_{i}} \\ 0,&{} \left| {d_i}\right| < {L_i} \end{array}\right. \quad i = 1,2, \cdots ,{7} \end{aligned}$$
    (4)
  • Reconstruct the 7-layer signals through the inverse wavelet transform; the reconstructed ECG signal is:

    $$\begin{aligned} X = {X_1} + {X_2} + \cdots + {X_{7}} \end{aligned}$$
    (5)

    We reconstruct the signal via the inverse wavelet transform after thresholding. The steps are as follows:

    • Step 1: For the highest level of detail coefficients and the lowest frequency approximation coefficients, use the inverse high pass filter and inverse low pass filter of the wavelet basis function for upsampling and convolution to obtain the reconstructed signal.

    • Step 2: For each level of detail and approximation coefficients, the inverse high pass filter and inverse low pass filter of the wavelet basis function are used for upsampling and convolution to obtain the reconstructed signal.

    • Step 3: Repeat Steps 1 and 2 until the reconstruction at all levels is complete.

  • Output the denoised ECG signals.
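As a rough illustration of the steps above, the following Python sketch performs a 7-level db6 decomposition with PyWavelets, computes a per-level threshold following Eqs. (1)-(3), and reconstructs the signal. The helper names and the choice to hard-threshold only the detail coefficients are our own illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import pywt

def sure_threshold(d):
    """Per-level threshold via the unbiased risk estimate of Eqs. (1)-(3)."""
    N = len(d)
    P = np.sort(d ** 2)                                  # squared coefficients, ascending (Step 1)
    i = np.arange(1, N + 1)
    R = (N - 2 * i + (N - i) * P + np.cumsum(P)) / N     # risk vector, Eq. (2)
    P_min = P[np.argmin(R)]                              # P_i at the minimum risk (Step 2)
    sigma = np.median(np.abs(d)) / 0.6745                # noise scale estimate
    return sigma * np.sqrt(P_min)                        # threshold L, Eq. (3)

def wavelet_denoise(ecg, wavelet="db6", level=7):
    """7-level db6 decomposition, per-level hard thresholding, reconstruction."""
    coeffs = pywt.wavedec(ecg, wavelet, level=level)
    denoised = [coeffs[0]] + [
        pywt.threshold(d, sure_threshold(d), mode="hard") for d in coeffs[1:]
    ]
    return pywt.waverec(denoised, wavelet)[: len(ecg)]   # thresholding and summation, Eqs. (4)-(5)
```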

Fig. 2

Wavelet denoising flowchart

Normalization

To obtain better fitting results and prevent divergence during training, the training set is standardized as follows:

$$\begin{aligned} X = \frac{X - \bar{X}}{\sigma } \end{aligned}$$
(6)

Where X denotes the training set, \(\overline{X}\) denotes the average value of the training set, and \(\sigma\) denotes the standard deviation of the training set.

Phase space reconstruction

In model training, we aim to predict the ECG signal at the next time step based on historical data; prior data are therefore used as the training input to predict the ECG signal at future time steps. In this paper, the final training set is constructed by phase space reconstruction. If the reconstruction dimension is m, the time delay is \(\tau\), and the normalized training ECG signal is \(X = \left[ x_1, x_2, \cdots , x_{n + 1 + (m - 1)\tau } \right]\), then the reconstructed training set is:

$$\begin{aligned} X = \left[ \begin{array}{cccc} {x_1}, &{} {x_2}, &{} \cdots &{} {x_n}\\ {x_{1 + \tau }}, &{} {x_{2 + \tau }}, &{} \cdots &{} {x_{n + \tau }} \\ \vdots &{} \vdots &{} &{} \vdots \\ {x_{1 + (m - 1)\tau }}, &{} {x_{2 + (m - 1)\tau }} , &{} \cdots &{} {x_{n + (m - 1)\tau }} \end{array} \right] \end{aligned}$$
(7)
$$\begin{aligned} Y = \left[ \begin{array}{ccc} {x_{2 + (m - 1)\tau }},&\cdots&{x_{n + 1 + (m - 1)\tau }} \end{array}\right] \end{aligned}$$
(8)

The ECG signals in the input layer of the prediction model are reconstructed according to Equations (7) and (8). The construction rule is as follows: starting from the first sampling point of the selected ECG signal, the 1st to 99th sampling points are used as the input sample and the 100th sampling point as the output sample, and so on. The input sample of the i-th training pair consists of the i-th to (i+98)-th sampling points, and the output sample is the (i+99)-th sampling point, where \(i = 1, 2, \ldots , 5000-99\). A total of 4901 input-output sample pairs are generated, forming the training data set of the model.
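For concreteness, the sketch below builds the delay-embedded matrix of Eqs. (7)-(8) and shows the 99-input/1-output construction described above; the z-score line corresponds to Eq. (6). The function name, the placeholder signal, and the assumption that the windowing corresponds to m = 99 and \(\tau\) = 1 are illustrative, not taken from the paper's code.

```python
import numpy as np

def delay_embed(x, m, tau, n):
    """Phase space reconstruction, Eqs. (7)-(8): each column of X holds m
    delayed samples and Y holds the corresponding one-step-ahead target."""
    X = np.stack([x[j * tau : j * tau + n] for j in range(m)])   # shape (m, n), Eq. (7)
    Y = x[1 + (m - 1) * tau : n + 1 + (m - 1) * tau]             # shape (n,),  Eq. (8)
    return X, Y

# Example consistent with the construction described above: a 5000-point
# signal (random placeholder here), m = 99, tau = 1, giving 4901 pairs.
x = np.random.randn(5000)
x = (x - x.mean()) / x.std()          # z-score standardization, Eq. (6)
X, Y = delay_embed(x, m=99, tau=1, n=5000 - 99)
```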

Nested LSTMs model

Information exhibits time correlation, and historical information can hold valuable clues for predicting future events. Traditional machine learning methods have only short-term memory, which limits their predictions when information is scarce. The LSTM introduces controllable self-recurrence to create paths through which gradients can flow over long durations, making it especially suitable for time series tasks and for tracking information over longer horizons. As a result, extended LSTM models have received increasing attention owing to these advantages.

Nested LSTM shares the same input layer, hidden layer, and output layer as the LSTM, and its unit structure is illustrated in Fig. 3. In this structure, an inner LSTM replaces the memory cell of the traditional LSTM, and access to the inner memory is gated in the same way as in a standard LSTM. The Nested LSTM can therefore access its inner memory more selectively, which gives the Nested LSTM prediction model stronger processing capability for ECG signals and higher prediction accuracy. This enhancement allows the Nested LSTM to capture and utilize more intricate temporal patterns in the data, making it well suited to tasks that require detailed information processing and precise predictions on electrocardiogram signals.

Fig. 3

Nested LSTM model unit structure diagram

The Nested LSTM consists of an inner LSTM and an outer LSTM. The gating of both is consistent with that of the traditional LSTM and comprises four components: a forget gate, an input gate, a candidate memory cell, and an output gate. The calculation equations for each gate are as follows.

Forget gate:

$$\begin{aligned} {f_t} = \sigma ({W_{fx}}{x_t} + {W_{fh}}{h_{t - 1}} + {b_f}) \end{aligned}$$
(9)

Input gate:

$$\begin{aligned} {i_t} = \sigma ({W_{ix}}{x_t} + {W_{ih}}{h_{t - 1}} + {b_i}) \end{aligned}$$
(10)

Candidate memory cell:

$$\begin{aligned} \widetilde{{c_t}} = \tanh ({W_{cx}}{x_t} + {W_{ch}}{h_{t - 1}} + {b_c}) \end{aligned}$$
(11)

Memory cell: the memory cell of the outer LSTM is replaced by an inner LSTM, whose input and hidden state are computed as follows:

$$\begin{aligned} {\overline{h}_{t - 1}} = {f_t}\ \cdot \ {c_{t - 1}} \end{aligned}$$
(12)
$$\begin{aligned} {\overline{x}_t} = {i_t}\ \cdot \ \widetilde{{c_t}} \end{aligned}$$
(13)
$$\begin{aligned} \overline{f}_t = \sigma ({\overline{W}_{fx}}\overline{x_t} + {\overline{W}_{fh}}{\overline{h}_{t - 1}} + {\overline{b}_f}) \end{aligned}$$
(14)
$$\begin{aligned} \overline{i}_t = \sigma ({\overline{W}_{ix}}\overline{x_t} + {\overline{W}_{ih}}{\overline{h}_{t - 1}}+ {\overline{b}_i}) \end{aligned}$$
(15)
$$\begin{aligned} \widetilde{\overline{c}}_t = \tanh ({\overline{W}_{cx}}\overline{x_t} + {\overline{W}_{ch}}{\overline{h}_{t - 1}} + {\overline{b}_c}) \end{aligned}$$
(16)
$$\begin{aligned} \overline{o}_t = \sigma ({\overline{W}_{ox}}\overline{x_t} + {\overline{W}_{oh}}{\overline{h}_{t - 1}} + {\overline{b}_o}) \end{aligned}$$
(17)
$$\begin{aligned} \overline{c}_t = \overline{f}_t\ \cdot \ {\overline{c}_{t - 1}} + \overline{i}_t{\hspace{1.0pt}} \cdot {\hspace{1.0pt}} \widetilde{\overline{c}}_t \end{aligned}$$
(18)
$$\begin{aligned} \overline{h} _t= \overline{o}_t\ \cdot \tanh (\overline{c}_t) \end{aligned}$$
(19)

The memory cell update method of the outer LSTM is:

$$\begin{aligned} {c_t} = \overline{{h}}_t \end{aligned}$$
(20)

Output gate:

$$\begin{aligned} {o_t} = \sigma ({W_{ox}}{x_t} + {W_{oh}}{h_{t - 1}} + {b_o}) \end{aligned}$$
(21)

A new round of hidden state:

$$\begin{aligned} {h_t} = {o_t}\ \cdot \ \tanh ({c_t}) \end{aligned}$$
(22)

Where \(\sigma\) denotes the sigmoid function. In the outer LSTM, \({W_{fx}}\) and \({W_{fh}}\) denote the weight matrices of the forget gate; \({W_{ix}}\) and \({W_{ih}}\) denote the weight matrices of the input gate; \({W_{cx}}\) and \({W_{ch}}\) denote the weight matrices of the candidate memory cell; \({W_{ox}}\) and \({W_{oh}}\) denote the weight matrices of the output gate; and \({b_f}\), \({b_i}\), \({b_c}\) and \({b_o}\) denote the biases of the forget gate, input gate, candidate memory cell, and output gate, respectively. In the inner LSTM, \(\overline{x}_t\), \(\overline{h}_{t - 1}\) and \(\overline{c}_{t - 1}\) denote the current input, the hidden state of the previous round, and the memory cell of the previous round, respectively; \({\overline{W}_{fx}}\) and \({\overline{W}_{fh}}\) denote the weight matrices of the forget gate; \({\overline{W}_{ix}}\) and \({\overline{W}_{ih}}\) denote the weight matrices of the input gate; \({\overline{W}_{cx}}\) and \({\overline{W}_{ch}}\) denote the weight matrices of the candidate memory cell; \({\overline{W}_{ox}}\) and \({\overline{W}_{oh}}\) denote the weight matrices of the output gate; and \(\overline{b}_f\), \(\overline{b}_i\), \(\overline{b}_c\) and \(\overline{b}_o\) denote the corresponding biases.

The output of the output layer is:

$$\begin{aligned} {y_t} = \sigma ({W_{yh}}{h_t}) \end{aligned}$$
(23)

Where \({W_{yh}}\) denotes the weight matrix of the output layer.
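To make the cell update concrete, here is a compact NumPy sketch of one Nested LSTM forward step following Eqs. (9)-(23). The parameter dictionaries (`outer`, `inner`, keyed by names such as `"Wfx"` and `"bf"`) and the function names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate(p, g, x, h):
    """Pre-activation W_gx x + W_gh h + b_g for gate g in {'f', 'i', 'c', 'o'}."""
    return p["W" + g + "x"] @ x + p["W" + g + "h"] @ h + p["b" + g]

def nested_lstm_step(x_t, h_prev, c_prev, c_in_prev, outer, inner):
    # Outer forget gate, input gate and candidate memory cell, Eqs. (9)-(11)
    f_t = sigmoid(gate(outer, "f", x_t, h_prev))
    i_t = sigmoid(gate(outer, "i", x_t, h_prev))
    c_tilde = np.tanh(gate(outer, "c", x_t, h_prev))

    # Hidden state and input handed to the inner LSTM, Eqs. (12)-(13)
    h_in = f_t * c_prev
    x_in = i_t * c_tilde

    # Inner LSTM gates and state update, Eqs. (14)-(19)
    f_in = sigmoid(gate(inner, "f", x_in, h_in))
    i_in = sigmoid(gate(inner, "i", x_in, h_in))
    c_in_tilde = np.tanh(gate(inner, "c", x_in, h_in))
    o_in = sigmoid(gate(inner, "o", x_in, h_in))
    c_in = f_in * c_in_prev + i_in * c_in_tilde
    h_in_new = o_in * np.tanh(c_in)

    # Outer memory cell, output gate and new hidden state, Eqs. (20)-(22)
    c_t = h_in_new
    o_t = sigmoid(gate(outer, "o", x_t, h_prev))
    h_t = o_t * np.tanh(c_t)

    y_t = sigmoid(outer["Wyh"] @ h_t)     # output layer, Eq. (23)
    return y_t, h_t, c_t, c_in
```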

The Nested LSTM model is a deep neural network that incorporates both feedforward and feedback mechanisms. The feedforward mechanism performs the forward computation of the Nested LSTM through Equations (9)-(23), while the feedback mechanism employs the error backpropagation algorithm to train and update the network parameters.

The training process of the feedback mechanism first needs to define the loss function:

$$\begin{aligned} {E_t}= & {} \frac{1}{2}{({y_t} - \overset{\wedge }{y_t}\ )^2}\nonumber \\ E= & {} \sum \limits _{t = 1}^T {{E_t}} \end{aligned}$$
(24)

Where \({E_t}\) denotes the error at time t, E denotes the total error, \({y_t}\) denotes the predicted value, and \(\overset{\wedge }{y_t}\) denotes the target value.

Fig. 4

Training flowchart of prediction model

The weight matrices and bias terms of each gate are updated according to the loss function [31]. The process is shown in Fig. 4, and the specific steps are as follows (a minimal training-loop sketch follows the list):

  • Initialize the parameters of the prediction model and set the error threshold.

  • Input the ECG signals training set.

  • Perform the forward computation according to Equations (9)-(23) to obtain the output corresponding to the current input.

  • Define the loss function.

  • Compute the gradient of each weight from the loss function, then update the weight matrices and bias terms along the gradient direction.

  • Judge whether the training error is less than the error threshold; if so, proceed to the final step; otherwise, return to the forward computation step.

  • End of training.
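The following is a hedged PyTorch-style sketch of the training loop in Fig. 4. `NestedLSTMModel` (any module wrapping the cell above), the choice of optimizer, and the threshold and learning-rate values are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

def train(model, X_train, Y_train, error_threshold=1e-4, max_epochs=500, lr=1e-3):
    """Steps of Fig. 4: forward pass, loss (cf. Eq. 24), backpropagation,
    parameter update, and stopping once the error falls below the threshold."""
    criterion = nn.MSELoss()                      # squared-error loss, cf. Eq. (24)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(max_epochs):
        optimizer.zero_grad()
        y_pred = model(X_train)                   # forward computation, Eqs. (9)-(23)
        loss = criterion(y_pred, Y_train)
        loss.backward()                           # error backpropagation
        optimizer.step()                          # update weight matrices and biases
        if loss.item() < error_threshold:         # error below threshold: stop training
            break
    return model
```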

Experiment and results

Data source

In this study, ECG signals were obtained from the Sudden Cardiac Death Holter Database on the PhysioNet website [37], a resource for complex physiological and biomedical signal research. The database contains 20 patients who experienced actual cardiac arrest and exhibited underlying sinus rhythm, paced rhythm, or atrial fibrillation prior to the event. Medical experts have meticulously annotated these ECG signals, identifying the onset of sudden cardiac death. The dataset for this study consists of 20 groups of 20-second signal segments to be predicted (16 seconds before and 4 seconds after the SCD event). Detailed information on the 20 groups of ECG signals used in this paper is shown in Table 1. We randomly select a segment of each subject's ECG signal as the training set; that is, a separate model is built for each subject. The training samples are constructed as described in the Phase space reconstruction section: the 1st to 99th sampling points form the first input sample and the 100th point its output, and so on, with \(i = 1, 2, \ldots , 5000-99\), yielding 4901 input-output sample pairs as the model's training data set.

Table 1 ECG Signals Information

Data preprocessing results

Utilizing all ECG signals as model input may introduce noise and decrease prediction accuracy. As a result, a wavelet denoising method is employed to remove noise. Denoising outcomes are displayed in Figs. 5, 6 and 7.

Fig. 5

Comparison of 30.dat signal before and after denoising

Fig. 6

Comparison of 31.dat signal before and after denoising

Fig. 7

Comparison of 32.dat signal before and after denoising

From Figs. 5, 6 and 7, it is evident that signals 30.dat and 31.dat exhibit noticeable baseline drift noise before denoising, while signal 32.dat exhibits both baseline drift and EMG noise. After denoising, signals 30.dat-32.dat become more stable (as supported by Table 2). This demonstrates that the wavelet denoising method employed in this study effectively eliminates baseline drift and EMG noise, indicating its capability to denoise ECG signals.

Table 2 SNR results

This study assesses the denoising effect through visual inspection and the signal-to-noise ratio (SNR). The denoising results are further compared using the SNR, a metric that measures the ratio of signal energy to noise energy, defined as follows:

$$\begin{aligned} SNR = 10\log \frac{{\sum \limits _{n = 1}^N {{{[x(n)]}^2}} }}{{\sum \limits _{n = 1}^N {{{[{x_m}(n) - x(n)]}^2}} }} \end{aligned}$$
(25)

Where x(n) denotes the original signal and \({x_m}(n)\) denotes the denoised signal. The denoising effect can be evaluated by comparing the SNR of the ECG signal before and after denoising. The SNR results are shown in Table 2.
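A one-line check of Eq. (25), assuming NumPy arrays `x` (original signal) and `x_m` (denoised signal); the function name is illustrative.

```python
import numpy as np

def snr_db(x, x_m):
    """Signal-to-noise ratio of Eq. (25), in decibels."""
    return 10 * np.log10(np.sum(x ** 2) / np.sum((x_m - x) ** 2))
```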

Table 2 shows that a portion of the noise in the aforementioned ECG signals has been removed, as evidenced by the increase in SNR after denoising. This also indicates that the raw ECG signal contains a significant amount of noise, which could affect the prediction results. Consequently, it is necessary to denoise the ECG signal before making predictions.

Prediction results and analysis

The Nested LSTM model was used to predict the risk of actual cardiac arrest for the 20 groups of ECG signals. In the experiment, the model was trained 20 times in total, with training times between 43 s and 58 s. A selection of prediction results is presented in Figs. 8, 9 and 10. The error is calculated as the difference between the true value and the predicted value.

Fig. 8

One-step prediction fitting result and error result of 30.dat signal

Fig. 9

One-step prediction fitting result and error result of 31.dat signal

Fig. 10

One-step prediction fitting result and error result of 32.dat signal

Figures 8, 9, and 10 illustrate notable fluctuations in the ECG signal. The actual values are represented in blue, while the fitted predictions are shown in red. Impressively, the proposed method exhibits remarkable proficiency in capturing these variations and accurately predicting trends in the ECG signal. The experimental results highlight the potential of predicting sudden cardiac death (SCD) before its occurrence, which holds life-saving implications for patients.

There are several techniques for classifying the risk of SCD. The Support Vector Machine (SVM) is a widely used classical classification method. Echo State Networks (ESN) and Long Short-Term Memory (LSTM) networks are adaptive data analysis methods that have been employed in SCD detection. Bidirectional LSTM (Bi-LSTM) is an improved variant of the LSTM comprising forward and backward LSTM components, which summarizes temporal information from both past and future contexts. To validate the performance of the Nested LSTM model, it is compared with the four models mentioned above. To quantitatively assess the performance of the predictive models, two error measures are employed in the experiment: the mean absolute error (MAE) and the root mean square error (RMSE).

$$\begin{aligned} RMSE = \sqrt{\frac{1}{N}{\sum \limits _{n = 1}^N ({Y_n} - \widehat{Y_n} )^2}} \end{aligned}$$
(26)
$$\begin{aligned} MAE = \frac{1}{N}\sum \limits _{n = 1}^N {\left| {{Y_n} - \widehat{Y_n} } \right| } \end{aligned}$$
(27)

Where N denotes the length of the predicted ECG signal, \({Y_n}\) denotes the predicted value, and \(\widehat{Y_n}\) denotes the target value. The prediction error results are then averaged. The RMSE results of the above models are shown in Table 3 and Figs. 11 and 12, the MAE results are shown in Table 4 and Figs. 13 and 14, and the average results are shown in Table 5.
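For reference, the two metrics of Eqs. (26)-(27) can be computed as below, assuming NumPy arrays of predicted values `y` and target values `y_hat`; the function names are illustrative.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error, Eq. (26)."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    """Mean absolute error, Eq. (27)."""
    return np.mean(np.abs(y - y_hat))
```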

Table 3 RMSE results
Fig. 11

RMSE results (a)

Fig. 12

RMSE results (b)

Table 4 MAE results
Fig. 13

MAE results (a)

Fig. 14

MAE results (b)

Table 5 Average results

RMSE can reflect the overall error of signal prediction. According to Figs. 11, 12 and Table 3, it can be observed that the SVM model has the highest RMSE. The SVM model, which lacks a feedback structure and cannot obtain historical information characteristics, performs poorly in ECG signal prediction. The ESN model, despite having a feedback structure, is simple and also exhibits a large error. The LSTM and Bi-LSTM models have low RMSE values, with their results being quite similar. Nested LSTM enhances the memory function of LSTM and extracts historical information features more accurately. When compared to the SVM, ESN, LSTM, and Bi-LSTM models, the RMSE of Nested LSTM is the smallest.

MAE represents the average of the absolute error between the predicted value and the observed value. According to Figs. 13, 14 and Table 4, it can be seen that the SVM model also has the largest MAE, and the Nested LSTM has the smallest MAE among the five models.

According to Table 5, it can be seen that the SVM model has the largest average RMSE and average MAE, and the Nested LSTM model has the smallest average RMSE and average MAE. When compared to the SVM, ESN, LSTM, and Bi-LSTM models, the average RMSE of Nested LSTM is reduced by 61.1%, 41.2%, 20.5%, and 17.9%, respectively. And the average MAE of Nested LSTM is reduced by 88.7%, 78.2%, 25.8%, and 32.6%, respectively. In conclusion, the Nested LSTM model demonstrates a strong nonlinear mapping ability for ECG signals.

Conclusion

In this study, we present an early prediction method for sudden cardiac death (SCD) risk that uses a Nested LSTM based on sequential electrocardiogram (ECG) features to predict a patient's ECG signals. ECG prediction is an effective approach for the early prediction of SCD risk. One limitation of traditional prediction methods is their low prediction accuracy on strongly nonlinear ECG signals, which can make them unsuitable for practical applications. It is therefore highly desirable to develop an optimized ECG prediction model with high prediction accuracy. With the timeliness and accuracy of prediction in mind, this paper exploits the nonlinear mapping capability of the Nested LSTM for ECG signals: the memory cell of the Nested LSTM is replaced by an inner LSTM, which has strong memory ability. To demonstrate the effectiveness and applicability of the proposed model, the ECG signals of 20 actual cardiac arrest patients are used in the empirical study. The experimental results show that the proposed model achieves better performance than the other four models.

The current similar methods include classification and regression techniques. References [14,15,16] employ classification models to anticipate abnormal and non-abnormal ECG patterns, whereas references [21,22,23] utilize regression models to align with ECG signal trends and identify ECG outliers. Distinguished from conventional classification techniques, this approach excels in forecasting the onset of SCD heartbeats, effectively capturing the dynamic, nonlinear, and non-stationary nature of time series, and adeptly accommodating the irregular trends in electrocardiogram signals. Furthermore, in contrast to traditional regression methodologies, the study devises an encompassing strategy that merges data preprocessing with predictive model development for ECG prediction. Empirical findings demonstrate a notable reduction in fitting errors, specifically in terms of RMSE and MAE, underscoring the efficacy of this novel methodology.

As for future work, we will study the multi-step prediction method of ECG signal characteristics, and use a series of deep learning methods and reinforcement learning methods to reduce multi-step prediction errors. More significantly, the practical applicability of ECG signal prediction methods will be verified in SCD diagnostic applications, potentially saving patients’ lives from SCD events.