1 Introduction

The World Health Organization indicates that, each year, 17.3 million people die worldwide from cardiovascular disease [1]. In this context, interest in noninvasive analysis techniques for the diagnosis of arrhythmias, such as the electrocardiogram (ECG), is increasing.

The ECG is an examination that records the variation of the electrical potentials of the cardiac muscle. It is composed of the P wave, which corresponds to the electrical activity of the atria; the Q, R, and S waves, which form the QRS complex and correspond to the depolarization of the ventricles; and the T wave, which registers the repolarization of the ventricles [2]. From the QRS complex, specifically the R waves (the largest peaks of the ECG signal), it is possible to obtain the R-R intervals, which are the time differences between consecutive R waves.
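As a minimal sketch of how R-R intervals follow from R-peak positions, the Python function below divides the sample distance between consecutive R peaks by the sampling frequency. The function name `rr_intervals` and the example peak positions are hypothetical and serve only as an illustration.

```python
def rr_intervals(r_peaks, fs):
    """Return the R-R intervals in seconds.

    r_peaks: sample indices of the R peaks, in increasing order
    fs: sampling frequency in Hz
    """
    # Each interval is the distance between two consecutive R peaks,
    # converted from samples to seconds.
    return [(later - earlier) / fs for earlier, later in zip(r_peaks, r_peaks[1:])]


# Illustrative R-peak positions (not taken from any database record).
example_peaks = [100, 228, 356, 470]
example_rr = rr_intervals(example_peaks, fs=128)
```

Each heartbeat contributes a single number to this list, which is the limitation that motivates the vector representation discussed next.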

Conventionally, the R-R interval is used for data extraction from the ECG signal in order to diagnose different types of arrhythmia [3–8]. However, the analysis of the R-R intervals cannot measure changes in other ECG waves, such as the distortions of the P wave in atrial fibrillation (AF) [9–12]. For this reason, some studies segment the ECG signal [13–17]. Nevertheless, the diagnosis is still not ideal. The present research proposes an analysis of any arbitrary segment of the ECG signal. This analysis is based on ECG variability and morphology. The morphological information of the ECG signal is obtained by investigating the voltage (mV) variation occurring in each heartbeat interval. This new method of data extraction allows a beat-to-beat analysis. Unlike the R-R interval, in which each heartbeat is associated with a single real number, the proposed method associates each heartbeat with a set of points, that is to say, a vector.

In this study, we propose to investigate the morphological information occurring at each heartbeat interval using statistical moments. The statistical moments used were variance, skewness, and kurtosis. The improvement in performance is due to the information obtained from the voltage variability and the morphology of the ECG signal, i.e., in addition to the frequency modulation information used in the heart rate variability (HRV) calculation, we added the amplitude modulation. Thus, to classify different types of arrhythmias, we use two kinds of modulation information from the ECG signal (frequency and amplitude), not just one as in the calculation of HRV. Unlike the R-R interval, in which each heartbeat is associated with a single real number, the proposed method associates each heartbeat with a set of points, that is, a vector.

The ECG signals were obtained from the MIT-BIH Normal Sinus Rhythm, MIT-BIH Atrial Fibrillation, and MIT-BIH Arrhythmia databases. The classifiers used to evaluate the proposed method were linear discriminant analysis, k-nearest neighbors, and support vector machine.

2 Materials and methods

The proposed method is based on the observation that arrhythmia episodes change the morphology of the ECG signal. A block diagram is shown in Fig. 1, and each of the processing blocks is described in the following sections.

Fig. 1 Block diagram of the segmentation and classification of ECG signals

2.1 Datasets

The ECG signals were obtained from three databases (DB). The MIT-BIH Normal Sinus Rhythm (NSR) database contains 18 ECG recordings of approximately 24 h duration. The subjects included in this database had no significant arrhythmias; they include 5 men, aged 26 to 45, and 13 women, aged 20 to 50 [18]. The MIT-BIH Arrhythmia database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects [19]. The MIT-BIH AF database contains 319 episodes of atrial fibrillation; its individual recordings, from 25 individuals, are each approximately 10 h long [20].

The database records contain the following rhythm types: atrial bigeminy, atrial fibrillation, atrial flutter, ventricular bigeminy, 2° heart block, idioventricular rhythm, normal sinus rhythm, nodal (A-V junctional) rhythm, paced rhythm, pre-excitation (WPW), sinus bradycardia, supraventricular tachyarrhythmia, ventricular trigeminy, ventricular flutter, and ventricular tachycardia.

2.1.1 Pre-processing

In the preprocessing step, the goal is to reduce the contamination of the ECG signal by different types of noise and artifacts. To this end, the following types of noise were removed: power line interference, a 60 Hz signal with a bandwidth below 1 Hz; baseline wander, a low-frequency (0.15 to 0.3 Hz) noise that results from the patient's respiration and shifts the baseline of the ECG signal; electrode contact noise, which results from a loss of contact between the electrode and the skin and effectively disconnects the measurement system from the subject; electrode motion artifacts, which result from variations in the electrode-skin impedance caused by electrode motion; muscle contraction noise, which results from the contraction of muscles other than the heart; electrosurgical noise, produced by other medical equipment in the patient care environment at frequencies between 100 kHz and 1 MHz; and instrumentation noise, produced by the electronic equipment used in the ECG measurements [21].

From the records of the MIT-BIH NSR, MIT-BIH Arrhythmia, and MIT-BIH AF databases, described in Section 2.1, 50,000 healthy heartbeats and 50,000 heartbeats of people with AF were extracted. Less than 1% of each ECG signal, at the beginning and at the end, was excluded due to measurement error. The ECG signal was normalized, and the sampling frequency was set to 128 Hz with 12-bit resolution in a range of ±10 mV. Two or more cardiologists independently annotated each record; disagreements were resolved to obtain the computer-readable reference annotations for each beat (approximately 110,000 annotations in all) included with the database.

2.2 Data extraction of the heartbeat

The data extraction of the ECG signal proposed in this study is carried out by analyzing the voltage variation on each heartbeat and is given by [22]

$$ B=(b_{\text{start}}, b_{2}, \cdots, b_{\text{end}}), $$
(1)

where B is a heartbeat, and b_start and b_end are given by

$$ b_{\text{start}}=P_{R}-F_{s}\lambda, $$
(2)

and

$$ b_{\text{end}}=P_{R}+F_{s}\theta, $$
(3)

where P_R is the position of the R peak (the P_R values are found in the annotation files of the MIT-BIH databases), F_s is the sampling frequency, and λ and θ are the proportion weights of the heartbeat, with λ+θ≤1. The parameters λ and θ are heuristically assigned and function as sliding windows on the heartbeat.
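The window defined by Eqs. 1 to 3 can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function name `extract_heartbeat`, the rounding of the window bounds to whole samples, and the example values of λ and θ are assumptions (the values actually used belong to Table 1).

```python
def extract_heartbeat(signal, r_peak, fs, lam, theta):
    """Return the samples of one heartbeat window around an R peak.

    signal: sequence of ECG voltage samples (mV)
    r_peak: sample index of the R peak (P_R)
    fs: sampling frequency in Hz (F_s)
    lam: weight of the window before the R peak (lambda)
    theta: weight of the window after the R peak (theta)
    """
    if lam < 0 or theta < 0 or lam + theta > 1:
        raise ValueError("lambda and theta must be non-negative with lambda + theta <= 1")
    # Rounding to whole samples is an assumption of this sketch.
    b_start = r_peak - int(round(fs * lam))   # Eq. 2
    b_end = r_peak + int(round(fs * theta))   # Eq. 3
    if b_start < 0 or b_end >= len(signal):
        raise IndexError("window falls outside the recording")
    # Eq. 1: all samples from b_start to b_end, both ends included.
    return list(signal[b_start:b_end + 1])


# Toy signal whose sample values equal their indices, so the window
# bounds can be read directly from the result. The lambda and theta
# values are illustrative only.
toy_signal = list(range(256))
beat = extract_heartbeat(toy_signal, r_peak=128, fs=128, lam=0.25, theta=0.5)
```

With these illustrative values, the window covers 32 samples before the R peak and 64 samples after it.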

2.3 Method’s generalization

The method presented in Section 2.2 specifically analyzes the heartbeat. More generally, an arbitrary segment of the heartbeat can be analyzed and is defined as

$$ X=(x_{\text{start}}, x_{2}, \cdots, x_{\text{end}}), $$
(4)

where X is any segment of a heartbeat. If the segment of interest in the heartbeat begins before the peak of the R wave, Eq. 2 must be used, but if the segment of interest starts after the peak of the R wave, Eq. 3 should be used. Both equations are heuristically adjusted. The end of an arbitrary segment in the ECG signal, from a specific part in a given heartbeat or even a succession of heartbeats, is given by

$$ x_{\text{end}}=\mathbf{t}_{a}F_{s}, $$
(5)

where t_a is the arbitrary period in the heartbeat or the ECG signal.

Other points of interest within the ECG are the peaks of the P, QRS, and T waves. The peak of a wave is therefore defined as

$$ P_{X}=\text{max}(X). $$
(6)
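A short Python sketch of Eqs. 4 to 6 is given below. It assumes that the product of the arbitrary period and the sampling frequency in Eq. 5 is read as the number of samples in the segment, counted from a chosen start index; this reading, the function names, and the toy signal are assumptions made for illustration.

```python
def extract_segment(signal, x_start, t_a, fs):
    """Return a segment that starts at x_start and lasts t_a seconds.

    Reading t_a * fs as the segment length in samples is an assumption.
    """
    n_samples = int(round(t_a * fs))
    if n_samples < 1:
        raise ValueError("the segment must contain at least one sample")
    if x_start < 0 or x_start + n_samples > len(signal):
        raise IndexError("segment falls outside the recording")
    return list(signal[x_start:x_start + n_samples])


def segment_peak(segment):
    """Return (index, value) of the largest sample in the segment (Eq. 6)."""
    if not segment:
        raise ValueError("segment must not be empty")
    peak_index = max(range(len(segment)), key=lambda i: segment[i])
    return peak_index, segment[peak_index]


# Toy values in mV sampled at a toy rate of 10 Hz (illustrative only).
toy_signal = [0.0, 0.1, 0.3, 0.2, 0.0, -0.1, 1.2, -0.3, 0.0, 0.2]
p_like_segment = extract_segment(toy_signal, x_start=0, t_a=0.5, fs=10)
peak = segment_peak(p_like_segment)
```

Because Eq. 6 takes the maximum, this sketch locates positive deflections only; a negative deflection would need a different criterion.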

Finally, a heartbeat can be defined as

$$ \begin{aligned} B &= P \cup PQ \cup QRS \cup ST \cup T \\ &= (p_{\text{start}}, p_{2}, \cdots, P_{P}, \cdots, p_{\text{end}}) \cup ({pq}_{\text{start}}, {pq}_{2}, \cdots, {pq}_{\text{end}}) \\ &\quad \cup ({qrs}_{\text{start}}, {qrs}_{2}, \cdots, P_{\text{QRS}}, \cdots, {qrs}_{\text{end}}) \cup ({st}_{\text{start}}, {st}_{2}, \cdots, {st}_{\text{end}}) \\ &\quad \cup (t_{\text{start}}, t_{2}, \cdots, P_{T}, \cdots, t_{\text{end}}). \end{aligned} $$
(7)–(10)

2.4 Feature extraction of the heartbeat

Modulation of the ECG signal can be performed in time (or frequency) and in amplitude (or energy). For frequency modulation, HRV is used, and for amplitude modulation, presented in Section 2.3, the voltage variation of the ECG signal is used. Thus, to classify different types of arrhythmias, we use two kinds of modulation information from the ECG signal, unlike several authors who use only the time modulation (HRV) [23–28]. The use of the two modulations allows a better characterization of the ECG signal, improving the quality of the arrhythmia classification. The variance, skewness, and kurtosis were used in this study to extract characteristics of both ECG signal modulations.

  • Variance

    $$ \sigma^{2}_{X} = E \left(X^{2} \right)-(E(X))^{2}; $$
    (11)
  • Skewness

    $$ \gamma_{X} = E\left[\left(\frac{X-E(X)}{\sigma_{X}}\right)^{3}\right]; $$
    (12)
  • Kurtosis

    $$ \kappa_{X} = E\left[\left(\frac{X-E(X)}{\sigma_{X}}\right)^{4}\right]. $$
    (13)
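The three moments of Eqs. 11 to 13 can be sketched in plain Python as shown below. The sketch assumes population (biased) estimators, in which every expectation is a simple average over the samples of the segment; the function name `statistical_moments` and the example values are hypothetical.

```python
import math


def statistical_moments(x):
    """Return (variance, skewness, kurtosis) of the samples in x.

    Population (biased) estimators are assumed: each expectation is
    replaced by the arithmetic mean over the samples.
    """
    n = len(x)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean = sum(x) / n
    # Algebraically equal to E(X^2) - (E(X))^2 in Eq. 11, but less
    # sensitive to floating-point cancellation.
    variance = sum((v - mean) ** 2 for v in x) / n
    if variance == 0:
        raise ValueError("skewness and kurtosis are undefined for a constant segment")
    sigma = math.sqrt(variance)
    skewness = sum(((v - mean) / sigma) ** 3 for v in x) / n   # Eq. 12
    kurtosis = sum(((v - mean) / sigma) ** 4 for v in x) / n   # Eq. 13
    return variance, skewness, kurtosis


# Illustrative symmetric segment (values are not ECG data).
variance, skewness, kurtosis = statistical_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

Under Eq. 13 the kurtosis is not the excess kurtosis, so a normal distribution gives a value of 3 rather than 0. Applied to a heartbeat window, the function turns each beat into a three-component feature vector.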

The proposed method is evaluated in a generalist (arrhythmia classification) and a specialist (AF classification) manner.

2.5 Performance evaluation

The new method of ECG data extraction is evaluated based on the ECG window for arrhythmia classification. The evaluation metrics are specificity (SPEC: how efficient the method is at diagnosing healthy patients), sensitivity (SENS: how efficient the method is at diagnosing patients with arrhythmias), and accuracy (ACC: how efficient the method is regarding the diagnosis overall).

The specificity and sensitivity are given, respectively, by

$$ \text{SPEC} = \frac{TN}{TN+FP} \times 100 $$
(14)

and

$$ \text{SENS} = \frac{TP}{TP+FN} \times 100, $$
(15)

and the accuracy is given by

$$ \text{ACC} = \frac{TP+TN}{TP+TN+FN+FP} \times 100, $$
(16)

where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.
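Eqs. 14 to 16 translate directly into code. The Python sketch below computes the three metrics as percentages from the four confusion-matrix counts; the function name `evaluate` and the example counts are hypothetical and are not results of this study.

```python
def evaluate(tp, tn, fp, fn):
    """Return (SPEC, SENS, ACC) in percent from confusion-matrix counts."""
    if tn + fp == 0 or tp + fn == 0:
        raise ValueError("both classes must be present in the test set")
    spec = 100.0 * tn / (tn + fp)                    # Eq. 14
    sens = 100.0 * tp / (tp + fn)                    # Eq. 15
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)    # Eq. 16
    return spec, sens, acc


# Illustrative counts only (not taken from the experiments).
spec, sens, acc = evaluate(tp=90, tn=95, fp=5, fn=10)
```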

3 Results

The experiments were conducted using 150,000 heartbeats (50,000 healthy heartbeats, 50,000 heartbeats of people with arrhythmia, and 50,000 heartbeats of people with AF). We used 80% of the patients for training (16 healthy patients, 41 patients with arrhythmia, and 20 patients with AF) and 20% of the patients for testing (2 healthy patients, 6 patients with arrhythmia, and 3 patients with AF). The heartbeats were classified as follows: linear discriminant analysis (LDA) was used in its linear version; k-nearest neighbors (k-NN) was run with k ranging from 2 to 5 nearest neighbors; and the support vector machine (SVM) used the polynomial kernel function. The heartbeats were obtained according to Section 2.1, and the parameters used are shown in Table 1.
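To make the k-NN step concrete, the following Python sketch classifies one feature vector (variance, skewness, kurtosis) by majority vote among its k nearest training vectors. It is a toy illustration and not the implementation used in the experiments: the Euclidean distance, the tie-breaking rule, the function name `knn_predict`, and all feature values are assumptions. LDA and SVM are omitted from the sketch.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify one feature vector by majority vote of its k nearest neighbors."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training vectors")
    # Order the training vectors by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest_labels = [train_y[i] for i in order[:k]]
    counts = Counter(nearest_labels)
    best = max(counts.values())
    # Ties (possible for even k) go to the tied label with the closest member.
    for label in nearest_labels:
        if counts[label] == best:
            return label


# Made-up feature vectors (variance, skewness, kurtosis) for illustration.
train_x = [(0.10, 0.0, 2.9), (0.12, 0.1, 3.0), (0.90, 1.5, 6.0), (1.00, 1.4, 5.5)]
train_y = ["healthy", "healthy", "arrhythmia", "arrhythmia"]
predictions = [knn_predict(train_x, train_y, (0.11, 0.05, 3.1), k) for k in (2, 3, 4)]
other = knn_predict(train_x, train_y, (0.95, 1.45, 5.8), k=3)
```

When the classes are well separated in feature space, as in this toy data, the predicted label does not change with k.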

Table 1 Parameters used in the ECG window

Table 2 shows the average accuracy of the classifiers (LDA, k-NN, SVM) for 5 different ECG windows (heartbeat, P wave, QRS complex, T wave, PQ segment).

Table 2 ECG window by the method proposed in 5 different cases

Figure 2 illustrates the heartbeat, P wave, QRS complex, T wave, and PQ segment processed by the proposed method according to the parameters described in Table 1.

Fig. 2 Proposed method for data extraction of the heartbeats. a Healthy, b arrhythmia, c P wave, d P wave, e QRS complex, f QRS complex, g T wave, h T wave, i P-Q interval, j P-Q interval

Table 3 shows the average accuracy of the classifiers using variance, skewness, and kurtosis in the diagnosis of arrhythmia and AF. The results in this study are evaluated using 10-fold cross-validation; that is, the final result is the mean of the metrics (specificity, sensitivity, and accuracy) over ten different training and testing sets.
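The averaging used in 10-fold cross-validation can be sketched as follows. The sketch assumes contiguous, unshuffled folds, which is a simplification; how the folds were actually formed in the experiments is not specified here. The function names and the metric values are hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def mean_metrics(per_fold):
    """Average (SPEC, SENS, ACC) tuples over all folds."""
    k = len(per_fold)
    return tuple(sum(fold[i] for fold in per_fold) / k for i in range(3))


folds = list(k_fold_indices(25, k=10))
# Illustrative per-fold metrics (not results of this study).
averaged = mean_metrics([(90.0, 80.0, 85.0), (100.0, 90.0, 95.0)])
```

Every sample is used for testing exactly once, and the reported metric is the mean over the folds.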

Table 3 Average accuracy of the classifiers (LDA, k-NN, SVM) using the proposed data extraction

Figure 3 illustrates the dispersion of the features extracted by the proposed method for each heartbeat.

Fig. 3 Dispersion of the features extracted by the proposed method (variance, skewness, and kurtosis) for each heartbeat. a Features of healthy patients and patients with arrhythmia. b Features of healthy patients and patients with AF

4 Discussion

The ECG window was efficient for healthy heartbeats (Fig. 2a) in all evaluated cases (heartbeat, P wave, QRS complex, T wave, PQ segment). This behavior is a consequence of signal uniformity: there is little or no difference in signal morphology, unlike in heartbeats with arrhythmia (Fig. 2b). The P wave with arrhythmia (Fig. 2d) presents deformations compared to the healthy P wave (Fig. 2c), a consequence of arrhythmias associated with the atria (e.g., atrial fibrillation, atrial flutter) [29–32]. Ventricular diseases, such as heart block, cause irregularities in the electrical activity of the ventricles, which is evident in QRS complexes with arrhythmia (Fig. 2f), a characteristic not observed in healthy patients (Fig. 2e) [33–36]. The morphology of the T wave is completely apparent in healthy heartbeats; however, in heartbeats with arrhythmia, the T wave is deformed [37–40]. The PQ segment is completely visible in healthy heartbeats (Fig. 2i); nevertheless, due to increased heart rate, the PQ segments are reduced in heartbeats with arrhythmia (Fig. 2j), in some cases as a consequence of AV block [41].

The proposed method proved to be efficient in solving global (accuracy of up to 99.78% in the arrhythmia classification) and local (accuracy of 100% in the AF classification) heartbeat problems. The improvement in performance is due to the information obtained from the voltage variability and the morphology of the ECG signal, i.e., in addition to the frequency modulation information used in the HRV calculation, we added the amplitude modulation. The variation in accuracy across the k values (2–5 nearest neighbors) in the classification of the heartbeats was lower than 0.07%. Compared with the R-R interval, the accuracy of the moments (variance, skewness, and kurtosis) is up to 9.82% higher when they are used separately and up to 15.38% higher when they are used together (Table 3). The invariance in the average accuracy of the classifiers used (LDA, k-NN, SVM) is due to the separability between healthy patients and patients with arrhythmia (Fig. 3a) and also between healthy patients and patients with AF (Fig. 3b).

The separability between healthy patients and patients with arrhythmia (Fig. 3a), and between healthy patients and patients with AF (Fig. 3b), is due to the information obtained from the voltage variability and the morphology of the ECG signal, i.e., in addition to the frequency modulation information used in the heart rate variability (HRV), we added the amplitude modulation. Thus, to classify different types of arrhythmias, we use two kinds of modulation information from the ECG signal (frequency and amplitude). In addition, the value in the right part of the image (red cross) corresponds to a single beat located at the beginning of the ECG signal. Evidently, the 1% rejection window used in this work was not large enough to exclude this beat.

Table 4 compares the proposed methodology with the performance in the literature for arrhythmia and AF classification.

Table 4 Sensitivity, specificity, and accuracy of the proposed methodology compared to the literature for arrhythmia and AF classification, evaluated for MIT-BIH Normal Sinus Rhythm, MIT-BIH Arrhythmia, and MIT-BIH Atrial Fibrillation databases

Table 4 shows that, even when compared with linear techniques such as SVM or nonlinear techniques such as neural networks, this study showed superior results in the arrhythmia classification (accuracy of 99.78%) and the AF classification (accuracy of 100%). The improvement in performance is due to the information obtained from the voltage variability and the morphology of the ECG signal, i.e., in addition to the frequency modulation information used in the HRV calculation, we added the amplitude modulation. Thus, to classify different types of arrhythmias, we use two kinds of modulation information from the ECG signal (frequency and amplitude), not just one as in the calculation of HRV.

4.1 Limitation

The proposed data extraction is characterized by statistical moments. The statistical moments depend on the average of the samples; consequently, a large number of outliers can complicate the analysis of the data. In addition, the temporal information of the ECG signal cannot be obtained directly; that is, it is not trivial to verify changes in the heart rate with the proposed method.

5 Conclusions

The main implication of this study is the complete analysis of the heartbeat. The critical factor that improved performance was the information obtained from the voltage variability and ECG morphology rather than the classifiers (LDA, k-NN, SVM). Unlike the R-R interval, in which each heartbeat is associated with a single real number, the proposed method associates each heartbeat with a set of points, that is, a vector. This factor provides an information gain so that techniques for extracting characteristics based on signal statistics are able to obtain the presented results. Therefore, it is enough to know the duration of the region of interest (Table 1). Furthermore, the simplicity of the proposed method allows for application in embedded systems similar to the 24-h Holter. Thus, such a system will be able not only to record the electrical activity of the heart and its variations but also to provide a prognosis for various arrhythmias. Our next challenge is to implement and evaluate the proposed method in real time.