1 Introduction

Cardiovascular diseases (CVDs) continue to be the main cause of death, especially in developed countries [43, 44, 46]. Prevention, through early and accurate diagnosis, could help to reduce mortality.

The detection task consists in identifying the beat positions in the original signal. It can be done at the individual-lead level (single-channel detection) [2, 31, 48], or by combining the information retrieved from all leads (multi-channel detection) [4, 17, 50].

The different phases of the heartbeat are reflected as oscillations in the signal (beat waves [18, 30, 41]). Atrial depolarization is represented by a small upward deflection (P wave). The main beat section is the QRS complex (Q, R, and S waves), which corresponds to ventricular depolarization. Ventricular repolarization can be studied through the T wave. Data on the duration and amplitude of the different waves are then used to detect possible anomalies in heart function. Computer-Aided Diagnosis (CAD) systems use this information to aid practitioners in their daily clinical practice. However, the use of this type of system is not widespread. One possible drawback is that these systems are not fully autonomous and cannot replace human decisions: physicians must evaluate the results obtained, and the final diagnosis must be based on their expertise and opinion. The main goal of CAD systems is to help experts by reducing the time needed to study a test, not to replace the practitioner.

Existing CAD systems are varied [1, 9, 10, 13, 19, 32]. However, current software tools usually focus on one specific disease or type of disease [1, 9, 10, 13, 32]. In addition, many of them must be embedded in other systems or used as add-ons, and either lack an easy-to-use GUI [1, 10] or offer only a simple one.

This paper presents the ElectroCardioGram Diagnosis Tool (ECGDT), an open-source graphical software tool (webpage: https://milegroup.github.io/RECGDT/). The system focuses on the detection of six common heart abnormalities: bradycardia (abnormally slow heart rhythm [20]), ischemia (restriction of blood to heart tissues, causing a shortage of oxygen [37]), myocardial infarction (cell death of cardiac myocytes [42]), tachycardia (abnormally fast heart rhythm [45]), ventricular hypertrophy (increase in the thickness of the ventricular walls [14]), and Wolff-Parkinson-White syndrome (existence of an accessory connection between the atria and the ventricles [39]).

2 Methods

The main objective of ECGDT is to give probabilities for the presence of the aforementioned cardiac diseases, and assist practitioners with their decisions.

This process involves four main steps: (1) heartbeats are detected in each channel; (2) multi-channel beat positions are then obtained; (3) waves are delineated inside beats; and (4) information on wave durations and amplitudes is used to perform the diagnosis (Fig. 1).

Fig. 1 Diagnosis flow inside ECGDT

2.1 Programming language

The core functionality of the tool was developed in the open-source language R with the RStudio IDE [38]. The R package Shiny [40] was used to implement the GUI. To improve efficiency, C code was also used for certain tasks.

2.2 Functionalities

2.2.1 Signal loading and exporting

ECG records can be loaded in ASCII or WFDB [11] format. Data can also be exported to ASCII files (optimized to require 40% less space than equivalent files obtained from PhysioToolkit [11] utilities), which can be downloaded to the user’s computer.

2.2.2 Signal representation

Signals can be represented graphically. Users can either select a continuous representation of the signal, refreshed in intervals of 5000 samples every 1500 milliseconds, or specify a channel and signal segment to plot. The interface also allows pausing the representation and manually moving forward and backward along the signal.

2.2.3 Beat detection

Single-channel beat detection is available for all the leads in the record. The positions of the QRS complexes are determined by applying specific thresholds to the signal slope. For this task, an enhanced version of the Pan-Tompkins [31] algorithm has been devised [24, 25].
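To make the idea of slope thresholding concrete, the following minimal sketch marks a beat wherever the absolute first difference of the signal exceeds a fraction of its maximum. It is not the enhanced Pan-Tompkins variant of [24, 25]: the function name `detect_beats`, the parameters `slope_fraction` and `refractory_s`, and the 100 ms peak-search window are all assumptions chosen for illustration.

```python
def detect_beats(signal, fs, slope_fraction=0.5, refractory_s=0.2):
    """Return sample indices of candidate QRS complexes.

    Illustrative slope-threshold detector; all parameters are hypothetical.
    """
    # The first difference approximates the signal slope.
    slope = [abs(signal[i + 1] - signal[i]) for i in range(len(signal) - 1)]
    if not slope:
        return []
    threshold = slope_fraction * max(slope)
    refractory = max(int(refractory_s * fs), 1)  # samples skipped after a beat
    search = max(int(0.1 * fs), 1)               # peak search window (100 ms)

    beats = []
    i = 0
    while i < len(slope):
        if slope[i] >= threshold:
            end = min(i + search, len(signal))
            # Take the largest absolute amplitude in the window as the beat position.
            peak = max(range(i, end), key=lambda k: abs(signal[k]))
            beats.append(peak)
            i = peak + refractory
        else:
            i += 1
    return beats


# Synthetic example: four triangular spikes on a flat baseline, sampled at 250 Hz.
centers = [100, 350, 600, 850]
ecg = [max((1 - abs(k - c) / 5 for c in centers), default=0.0) for k in range(1000)]
ecg = [max(v, 0.0) for v in ecg]
positions = detect_beats(ecg, fs=250)
```

The refractory period prevents the same complex from being reported twice, which is one simple way of limiting false positives.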

Beat positions can be exported to ASCII files (for an individual channel or all the channels of the record).

Detected beats can also be shown continuously or in segment representation.

Similarly, multi-channel (global) beat detection can be performed. A global beat position is obtained by combining all the positions detected for that beat in the different leads available in the record [23]. Combining these positions yields a single, more precise value to locate the beat.
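One simple way of combining per-lead positions is sketched below: detections from all leads are grouped when they fall within a tolerance window, groups supported by too few leads are discarded, and the median of each remaining group is taken as the global position. This is an illustrative scheme, not the method of [23]; the names `global_beats`, `tolerance`, and `min_leads` are assumptions.

```python
from statistics import median


def global_beats(per_lead_positions, tolerance=10, min_leads=None):
    """Combine per-lead beat positions (in samples) into global positions.

    Illustrative grouping scheme; parameters are hypothetical.
    """
    n_leads = len(per_lead_positions)
    if min_leads is None:
        min_leads = n_leads // 2 + 1  # require a majority of leads

    # Tag every detection with its lead index and sort by position.
    tagged = sorted(
        (pos, lead)
        for lead, positions in enumerate(per_lead_positions)
        for pos in positions
    )

    # Group detections lying within `tolerance` samples of the group's first one.
    groups, current = [], []
    for pos, lead in tagged:
        if current and pos - current[0][0] > tolerance:
            groups.append(current)
            current = []
        current.append((pos, lead))
    if current:
        groups.append(current)

    combined = []
    for group in groups:
        supporting_leads = {lead for _, lead in group}
        if len(supporting_leads) >= min_leads:
            combined.append(int(round(median(pos for pos, _ in group))))
    return combined


leads = [[100, 350, 600], [102, 349, 601], [98, 352, 420, 602]]
merged = global_beats(leads)
```

In this example the detection at sample 420 appears in a single lead only, so it is treated as a false positive and dropped.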

Users can also represent multi-channel detected beats graphically. In addition, two comparison tools are available: continuous-signal and segment comparisons between single-channel and multi-channel detection.

2.2.4 Average beat

If desired, users can obtain an average (or basal) beat that is representative of a given signal segment. For this functionality, users can select the lead, the beat number, the number of beats to use before and after the given one, and the length of signal to take before and after the position of the R wave of each beat. ECGDT creates an average beat by combining all these signal segments into one, using an in-house algorithm.

For the definition of the average beat, default values were empirically set to 200 ms before and 500 ms after the beat position, and 5 beats on each side of the central one. Sample cross-correlations with lag are computed between the central beat and each of the others. The final value of each sample is then determined with a weighted method: beats with higher correlation with the central beat receive a higher weight. With this technique, the system can avoid the influence of false positives or anomalous beats when obtaining a representative for a given segment. These beats are also processed to eliminate noise and trends present in the original signal. An example of a basal beat obtained with ECGDT can be seen in Fig. 2.

Fig. 2 Example of a basal beat (grey) compared to the equivalent original one (red)
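The correlation-weighted averaging can be sketched as follows. This simplified version uses the Pearson correlation at zero lag as the weight and clips negative correlations to zero; ECGDT itself applies cross-correlations with lag, and its exact weighting is not reproduced here. The names `pearson` and `average_beat` are assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def average_beat(segments, central_index):
    """Weighted sample-by-sample mean of equal-length beat segments.

    Each segment is weighted by its (non-negative) correlation with the
    central beat, so anomalous beats contribute little or nothing.
    """
    central = segments[central_index]
    weights = [max(pearson(central, seg), 0.0) for seg in segments]
    total = sum(weights)
    if total == 0:  # e.g. a flat central beat: fall back to the central segment
        return list(central)
    return [
        sum(w * seg[k] for w, seg in zip(weights, segments)) / total
        for k in range(len(central))
    ]


normal = [0.0, 1.0, 4.0, 1.0, 0.0]
inverted = [0.0, -1.0, -4.0, -1.0, 0.0]  # stands in for an anomalous beat
basal = average_beat([normal, normal, inverted, normal], central_index=1)
```

Because the inverted beat is negatively correlated with the central one, its weight is zero and the resulting basal beat matches the normal morphology.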

2.2.5 Wave delineation

Wave delineation identifies the onset, end, and peak positions of the main waves inside the beat.

Delineation uses wavelet transforms, with a technique inspired by [21] and described in [22]. In essence, quadratic spline wavelet transforms are applied to the average beats to obtain the first three scales. QRS fiducial points are detected from maxima and zero crossings in scales 1 and 2, and P and T waves are searched for similarly in scale 3, restricted to specific intervals.
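As a rough illustration of how zero crossings of the wavelet detail signals relate to wave peaks, the sketch below computes three detail scales with an undecimated ("à trous") filter bank. The filter taps (low-pass 1/8·[1, 3, 3, 1], high-pass 2·[1, −1]) are those commonly reported for the quadratic spline wavelet in the ECG delineation literature and are an assumption here, as are the function names; the actual procedure is the one described in [22].

```python
def _at(x, i):
    """Read x[i] with the index clamped to the valid range (edge padding)."""
    return x[min(max(i, 0), len(x) - 1)]


def spline_wavelet_details(x, n_scales=3):
    """Return the detail signals for scales 1..n_scales (undecimated scheme)."""
    approx = list(x)
    details = []
    for j in range(n_scales):
        step = 2 ** j  # spacing between filter taps doubles at each scale
        # High-pass branch: scaled difference, behaves like a smoothed derivative.
        details.append(
            [2.0 * (_at(approx, n + step) - _at(approx, n)) for n in range(len(approx))]
        )
        # Low-pass branch: smoothed approximation passed on to the next scale.
        approx = [
            (
                _at(approx, n + 2 * step)
                + 3 * _at(approx, n + step)
                + 3 * _at(approx, n)
                + _at(approx, n - step)
            )
            / 8.0
            for n in range(len(approx))
        ]
    return details


def zero_crossings(detail):
    """Indices n where the detail signal goes from positive to non-positive."""
    return [n for n in range(len(detail) - 1) if detail[n] > 0 and detail[n + 1] <= 0]


# Synthetic beat: one triangular wave peaking at sample 50.
beat = [float(max(0, 10 - abs(k - 50))) for k in range(100)]
scales = spline_wavelet_details(beat)
crossings = zero_crossings(scales[0])
```

Since each detail signal approximates a derivative, a positive-to-negative zero crossing lies next to a local maximum of the wave: here the crossing in scale 1 is found between samples 49 and 50.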

2.2.6 Diagnosis

Six diseases can be identified by ECGDT. Using the acquired knowledge, such as event amplitude and duration data, the probability of the presence of bradycardia, ischemia, myocardial infarction, tachycardia, ventricular hypertrophy, and Wolff-Parkinson-White syndrome is established [26].

For this task, probability scores (P) were calculated for each of the six medical conditions. They were based on specific diagnostic rules, derived from the features and durations of the beat waves. First, these diagnostic rules had to be defined. To this end, normality parameters were compiled from different sources [6, 7, 27, 28, 29]. Specifically, the following rules were considered:

  • Rule #1: RR interval duration (RR.Duration) equal to or above 0.6 s.

  • Rule #2: RR interval duration (RR.Duration) equal to or below 1.2 s.

  • Rule #3: PR interval duration (PR.Duration) equal to or above 0.12 s.

  • Rule #4: PR interval duration (PR.Duration) equal to or below 0.22 s.

  • Rule #5: QRS complex duration (QRS.Duration) equal to or below 0.12 s.

  • Rule #6: Q wave duration (Q.Wave.Duration) below 0.03 s.

  • Rule #7: Q wave height (Q.Wave.Height) between 25 and 33% of the R wave height (R.Wave.Height).

  • Rule #8: R wave height (R.Wave.Height) increases progressively from lead V1 to lead V5, and decreases from lead V5 to lead V6.

  • Rule #9: R wave height (R.Wave.Height) in lead V1 equal to or below 0.5 mV, if the S wave is smaller than the R wave.

  • Rule #10: QTc interval duration (QTc.Duration) equal to or below 0.47 s.

  • Rule #11: ST segment depression (ST.Depression) below 0.05 mV.

  • Rule #12: ST segment depression (ST.Depression) in leads V4, V5, and V6.

  • Rule #13: ST segment elevation (ST.Elevation).

  • Rule #14: T wave with same sign (T.Wave.Sign) as QRS complex.
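The threshold-based rules in this list translate directly into predicates over a set of measured features. The sketch below encodes rules #1 to #6, #10, and #11 only; the dictionary layout, the function name `violated_rules`, and the feature values in the example are hypothetical and are not taken from ECGDT or from any record.

```python
# Each rule maps a feature dictionary to True when the normality condition holds.
# Durations are in seconds, ST depression in millivolts.
RULES = {
    1: lambda f: f["RR.Duration"] >= 0.6,
    2: lambda f: f["RR.Duration"] <= 1.2,
    3: lambda f: f["PR.Duration"] >= 0.12,
    4: lambda f: f["PR.Duration"] <= 0.22,
    5: lambda f: f["QRS.Duration"] <= 0.12,
    6: lambda f: f["Q.Wave.Duration"] < 0.03,
    10: lambda f: f["QTc.Duration"] <= 0.47,
    11: lambda f: f["ST.Depression"] < 0.05,
}


def violated_rules(features):
    """Return the sorted numbers of the rules that the features do not satisfy."""
    return sorted(number for number, rule in RULES.items() if not rule(features))


# Hypothetical measurements for one average beat.
example = {
    "RR.Duration": 1.4,
    "PR.Duration": 0.16,
    "QRS.Duration": 0.10,
    "Q.Wave.Duration": 0.02,
    "QTc.Duration": 0.42,
    "ST.Depression": 0.0,
}
broken = violated_rules(example)
```

In this example only rule #2 is violated: an RR interval of 1.4 s exceeds the 1.2 s limit, which corresponds to a heart rate below 50 beats per minute.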

Once the diagnostic rules were established, Generalized Additive Models (GAMs) were constructed to obtain P for each abnormality. GAMs can be considered an extension of Generalized Linear Models (GLMs) that allows modeling nonlinear relationships between the predictive variables and the response variable. To capture these nonlinear relationships, GAMs use smooth functions, which can be completely flexible to fit the data [12, 47].

The general expression for a GAM is as follows:

$${\mathrm g}(\mu)={\mathrm\beta}_0+{\mathrm f}_1({\mathrm X}_1)+\dots+{\mathrm f}_{\mathrm n}({\mathrm X}_{\mathrm n})$$

where n is the total number of predictive variables, g(µ) is the link function relating the mean of the response variable to the additive combination of predictive functions, β0 is the model constant, and fi(Xi) is the smooth function representing the relationship between the response variable and the predictive variable Xi.
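The following sketch shows how a fitted model of this form could be evaluated for one observation. It assumes a logit link, a common choice when the response is a probability (the link actually used is not stated here), and it replaces the fitted smooth functions with piecewise-linear interpolators over hypothetical knots. The names `interpolate` and `gam_predict` and all numeric values are illustrative.

```python
from math import exp


def interpolate(knots, x):
    """Piecewise-linear stand-in for a fitted smooth function.

    `knots` is a list of (x, y) pairs sorted by x; values outside the
    knot range are held constant.
    """
    if x <= knots[0][0]:
        return knots[0][1]
    if x >= knots[-1][0]:
        return knots[-1][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def gam_predict(beta0, smooths, features):
    """Evaluate mu = g^-1(beta0 + sum_i f_i(X_i)) under an assumed logit link."""
    eta = beta0 + sum(
        interpolate(knots, features[name]) for name, knots in smooths.items()
    )
    return 1.0 / (1.0 + exp(-eta))


# Hypothetical smooth for RR.Duration: longer RR intervals raise the score.
smooths = {"RR.Duration": [(0.6, -2.0), (1.2, 0.0), (1.8, 2.0)]}
p_short = gam_predict(0.0, smooths, {"RR.Duration": 0.6})
p_mid = gam_predict(0.0, smooths, {"RR.Duration": 1.2})
p_long = gam_predict(0.0, smooths, {"RR.Duration": 1.8})
```

Each predictor contributes through its own smooth function, and the contributions are simply added before the inverse link maps the sum to the response scale.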

A combination of smooth functions to obtain nonlinear and non-parametric relationships between parameters was constructed for the probability P of each cardiac disease. Experimental results yielded the following expressions:

$$\begin{array}{l}\mathrm P(\mathrm{bradycardia})={\mathrm\beta}_{\mathrm{bradycardia}}\:+{\mathrm f}_{\mathrm{RR}.\mathrm{Duration}}\;(\mathrm{RR}.\mathrm{Duration})+{\mathrm f}_{\mathrm{QTc}.\mathrm{Duration}}(\mathrm{QTc}.\mathrm{Duration})\\\mathrm P(\mathrm{ventricularHypertrophy})\;=\;{\mathrm\beta}_{\mathrm{ventricularHypertrophy}}\:+\:{\mathrm f}_{\mathrm Q.\mathrm{Wave}.\mathrm{Duration}}(\mathrm Q.\mathrm{Wave}.\mathrm{Duration})+{\mathrm f}_{\mathrm Q.\mathrm{Wave}.\mathrm{Height}}(\mathrm Q.\mathrm{Wave}.\mathrm{Height})+{\mathrm f}_{\mathrm R.\mathrm{Wave}.\mathrm{Height}}(\mathrm R.\mathrm{Wave}.\mathrm{Height})+{\mathrm f}_{\mathrm{ST}.\mathrm{Depression}}(\mathrm{ST}.\mathrm{Depression})+{\mathrm f}_{\mathrm T.\mathrm{Wave}.\mathrm{Sign}}(\mathrm T.\mathrm{Wave}.\mathrm{Sign})\\\mathrm P(\mathrm{myocardialInfarction})={\mathrm\beta}_{\mathrm{myocardialInfarction}}+{\mathrm f}_{\mathrm{PR}.\mathrm{Duration}}(\mathrm{PR}.\mathrm{Duration})+{\mathrm f}_{\mathrm R.\mathrm{Wave}.\mathrm{Height}}(\mathrm R.\mathrm{Wave}.\mathrm{Height})+{\mathrm f}_{\mathrm{ST}.\mathrm{Depression}}(\mathrm{ST}.\mathrm{Depression})+{\mathrm f}_{\mathrm{ST}.\mathrm{Elevation}}(\mathrm{ST}.\mathrm{Elevation})\\\mathrm P(\mathrm{ischemia})=\;{\mathrm\beta}_{\mathrm{ischemia}}+{\mathrm f}_{\mathrm{PR}.\mathrm{Duration}}(\mathrm{PR}.\mathrm{Duration})+{\mathrm f}_{\mathrm R.\mathrm{Wave}.\mathrm{Height}}(\mathrm R.\mathrm{Wave}.\mathrm{Height})+{\mathrm f}_{\mathrm{QTc}.\mathrm{Duration}}(\mathrm{QTc}.\mathrm{Duration})\:+\:{\mathrm f}_{\mathrm{ST}.\mathrm{Depression}}(\mathrm{ST}.\mathrm{Depression})\:+\:{\mathrm f}_{\mathrm{ST}.\mathrm{Elevation}}(\mathrm{ST}.\mathrm{Elevation})\:+\:{\mathrm f}_{\mathrm T.\mathrm{Wave}.\mathrm{Sign}}(\mathrm T.\mathrm{Wave}.\mathrm{Sign})\\\mathrm P(\mathrm{WPWSyndrome})\;=\;{\mathrm\beta}_{\mathrm{WPWSyndrome}}+{\mathrm f}_{\mathrm{PR}.\mathrm{Duration}}(\mathrm{PR}.\mathrm{Duration})\:+\:{\mathrm 
f}_{\mathrm{QRS}.\mathrm{Duration}}(\mathrm{QRS}.\mathrm{Duration})\:+\:{\mathrm f}_{\mathrm Q.\mathrm{Wave}.\mathrm{Duration}}(\mathrm Q.\mathrm{Wave}.\mathrm{Duration})\:+\:{\mathrm f}_{\mathrm Q.\mathrm{Wave}.\mathrm{Height}}(\mathrm Q.\mathrm{Wave}.\mathrm{Height})\:+\:{\mathrm f}_{\mathrm R.\mathrm{Wave}.\mathrm{Height}}(\mathrm R.\mathrm{Wave}.\mathrm{Height})\:+\:{\mathrm f}_{\mathrm{ST}.\mathrm{Depression}}(\mathrm{ST}.\mathrm{Depression})\:+\:{\mathrm f}_{\mathrm T.\mathrm{Wave}.\mathrm{Sign}}(\mathrm T.\mathrm{Wave}.\mathrm{Sign})\\\mathrm P(\mathrm{tachycardia})\;=\;{\mathrm\beta}_{\mathrm{tachycardia}}\:+\:{\mathrm f}_{\mathrm{RR}.\mathrm{Duration}}(\mathrm{RR}.\mathrm{Duration})\:+\:{\mathrm f}_{\mathrm{ST}.\mathrm{Depression}}(\mathrm{ST}.\mathrm{Depression})\end{array}$$

2.2.7 Settings & statistics

This section allows adapting the interface to the needs of the user. The basic mode includes all the information and instructions available on the different screens. The expert mode reduces the information shown on-screen, to leave a larger and cleaner area for the contents. Information about the current ECG record, and about ECGDT and its current version, can also be displayed.

2.3 Graphical user interface

All functionalities of ECGDT are accessed through the GUI, which is organized in two differentiated areas, as shown in Fig. 3. The left panel contains the tool menu, and the right panel corresponds to the main part of the interface, with the work area. In ECGDT the menu is adaptive: it only shows the items available for the current contents. As an example, when the user accesses the system, only the options Load ECG and Settings are visible. The work area can be organized in tabs and contains the menu options.

Fig. 3 ECGDT home screen

ECGDT is structured using a client-server architecture, and users access it through a web browser. ECG records can be uploaded to the server, and files created by ECGDT can be downloaded to local storage. This decision was made taking into consideration the varied configurations and the low or outdated specifications of the computers used by medical professionals. With this approach, users do not have to install or configure anything on their computers. The only requirements are an Internet connection and a web browser; no further hardware or software is needed. An example use case of the system is also available (Footnote 1).

All files required for the server side of ECGDT are available on GitHub, and accessible from the ECGDT webpage, in the R package RECGDT (R ECG Diagnosis Tool). Users who wish to deploy ECGDT should configure an RStudio environment on a centralized server. R and RStudio should be installed, together with the required package dependencies (e.g., the Shiny package). After that, the contents of the GitHub repository (the folder RECGDT, which includes the files and directories required for the RECGDT RStudio project) should be downloaded and placed in a folder on the server. The project should be run through RStudio on the server, so that remote users can use its functionalities via the Shiny interface. By default, Shiny launches on a localhost address, so the app is accessible only from the computer hosting the environment. To give end users access, the shiny-server configuration for the URL and port should be modified.

2.4 System validation

To validate the system, both its diagnostic capabilities and its ease of use were evaluated.

For the first aspect, because certain findings must be sought in specific channels, it was necessary to use a database of 12-lead records. To assess the diagnostic precision, the St. Petersburg INCARTDB was selected. This database contains annotations for the conditions present in each of its records, although not all of its 75 records present any of the diseases detected by ECGDT. Forty-three of its records were suitable and were used for the purposes of this work.

Evaluating a GUI is a complex task, since it tries to measure subjective impressions such as attractiveness or ease of use. Several approaches to quantifying user experience have appeared in recent years. One of these alternatives is the User Experience Questionnaire (UEQ) [35], which estimates values for six abstract dimensions based on the responses given by the users.

The questionnaire template is available on the UEQ website (Footnote 2), together with a results spreadsheet that automatically computes the dimension values from the users’ responses.

ECGDT was presented to a diverse group of 15 users, with different levels of computer and cardiology expertise. After a brief tutorial, they were asked to try the tool for a longer period and to fill in the questionnaire once they had sufficient experience with the system to evaluate it.

3 Sample run

To illustrate ECGDT’s capabilities, the ECG record I17 from the St. Petersburg INCARTDB [11] was selected. It corresponds to a 64-year-old man with a transient ischemic attack, bradycardia, premature ventricular complexes, and right bundle branch block. ECGDT should be able to identify the presence of bradycardia and ischemia. At the same time, it should not raise an alarm for any of the other four evaluated conditions.

In the following subsections, a detailed explanation of the diagnosis process is presented.

3.1 Signal loading and exporting

An ASCII version of the original record is selected in the corresponding Load menu (Fig. 3). Clicking on Browse opens a file browser window, where the user can locate the file of the desired record on the local computer and select it. Once the file is selected, a progress bar shows the progress of the upload, whose duration depends on the file size and the Internet connection speed. When the server has correctly generated a copy of the local file, the progress bar indicates the end of the upload with the Upload complete label. Signal voltage values can also be recalibrated. The Load ECG file button inserts the file contents into the system variable ECGData, which stores the record id, sampling frequency, number of leads, number of samples, patient’s information, date, and time. Fields for each lead are also present (signal values and positions of detected beats are stored). One field is reserved for global beats, and another keeps the diagnosed probability for each of the six evaluated diseases. Finally, one field is available to keep manual annotations.
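The layout of the record container described above can be pictured with the following sketch. ECGDT keeps this information in an R variable (ECGData); the Python classes below only mirror the listed fields for illustration, and every class and attribute name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lead:
    """Per-lead data: signal values and positions of the detected beats."""
    name: str
    signal: List[float] = field(default_factory=list)
    beat_positions: List[int] = field(default_factory=list)


@dataclass
class ECGRecord:
    """Illustrative counterpart of the fields kept for a loaded record."""
    record_id: str
    sampling_frequency: float
    n_samples: int
    patient_info: str = ""
    date: str = ""
    time: str = ""
    leads: List[Lead] = field(default_factory=list)
    global_beats: List[int] = field(default_factory=list)      # multi-channel positions
    diagnosis: Dict[str, float] = field(default_factory=dict)  # condition -> score
    annotations: List[str] = field(default_factory=list)       # manual annotations

    @property
    def n_leads(self) -> int:
        """Number of leads, derived from the stored lead list."""
        return len(self.leads)


record = ECGRecord(
    record_id="I17",
    sampling_frequency=257.0,  # placeholder value for the example
    n_samples=1000,            # placeholder value for the example
    leads=[Lead("I"), Lead("II")],
)
```

Keeping detection results, global beats, and diagnosis scores next to the raw signal means that later steps (plotting, export, diagnosis) can all work from the same object.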

The inverse process can also be performed in the Export ECG menu, obtaining an ASCII file with the record contents, by indicating a file name and path to store it.

3.2 Signal representation

The Show signal option allows users to select between continuous signal and segment representations. Each of these options is located in a tab in the work area. In Show all the signal, users can select the ECG lead they want to observe from a list of all the leads available for the current record. By clicking on Show, the signal starts to refresh on the right side (Fig. 4). The user can also pause the signal with Stop and move forward and backward with Next >> and << Previous. Continuous signal representation can be resumed with Show.

Fig. 4 Continuous signal representation. While the representation is running, the manual advance buttons are disabled, and only the Stop button is enabled

In the other tab, the user can select signal segment representation. In addition to the lead, segment limits should be indicated. The desired signal portion will appear on the right side, after clicking on Show. In this option, segments are limited to a minimum size of 10 samples and a maximum size of 10,000 samples. If the window size is exceeded, an error message will appear. If the value of the start sample is beyond the end sample, the system automatically exchanges the limits and plots the proper signal fragment.
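The segment checks described above amount to a small validation routine, sketched here. The function name `normalize_segment` and the convention that the segment size counts both end samples are assumptions; only the limits of 10 and 10,000 samples and the automatic swap come from the description.

```python
MIN_SEGMENT = 10      # minimum segment size, in samples
MAX_SEGMENT = 10_000  # maximum segment size, in samples


def normalize_segment(start, end):
    """Return (start, end) in ascending order, or raise if the size is invalid."""
    if start > end:
        # Limits given in reverse order are exchanged rather than rejected.
        start, end = end, start
    size = end - start + 1  # assumed convention: both limits are included
    if size < MIN_SEGMENT or size > MAX_SEGMENT:
        raise ValueError(
            f"segment size {size} outside [{MIN_SEGMENT}, {MAX_SEGMENT}] samples"
        )
    return start, end


segment = normalize_segment(5000, 1000)
```

Swapping the limits silently is a usability choice: a reversed range has an obvious intended meaning, whereas an out-of-range size does not, so only the latter produces an error.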

3.3 Beat detection

Single-channel beat detection can be performed through the Beat detection menu. By clicking on the Detect beats button, an algorithm marks the positions of the beats present in each lead of the record. False-positive filtering is also applied. These beat positions can then either be exported or plotted with Show detected beats.

In addition, from the single-channel beat positions, ECGDT can compute more precise multi-channel beat positions. For this purpose, the beat information obtained from all leads of the record is combined. This option is available in the Global beat detection menu, via the Obtain global beat positions button. Two additional tabs provide single-channel vs. multi-channel detection comparisons, in both continuous and segment formats (Fig. 5).

Fig. 5 Detail of single-channel and multi-channel detection comparison

3.4 Average beat

An average beat can be created. This option obtains a representative beat for a specified segment. Users can select the channel, the central beat (from the global detection positions), the milliseconds of signal to use before and after the QRS position, and the number of preceding and following beats to use.

3.5 Wave delineation

For this task, only the channel and the number of the beat must be provided. ECGDT presents this information to the user with a graphical representation of the beat (in grey), highlighting the extension of each event over the signal in a different color, and marking the positions of the peaks inside them (Fig. 6).

Fig. 6 Example of wave delineation inside a beat

3.6 Diagnosis

In this particular example (record I17), ECGDT must indicate the presence of bradycardia and ischemia. The process starts when the user clicks on the Obtain a candidate diagnose for this patient button.

After executing the diagnosis process on the current ECG record, ECGDT correctly detects bradycardia and ischemia, with scores of 0.26 and 0.45 respectively (values over 0.20–0.25 indicate high probability). For the other diseases, lower scores are obtained (Fig. 7).

Fig. 7 Diagnosis results obtained with ECGDT for record I17. The system correctly identified the two conditions present in the record

3.7 Settings & statistics

Users can alternate between basic and expert modes by turning the expert mode switch off and on (Settings). The basic mode shows instructions and hints on the different screens; the expert mode reduces them to a minimum, to obtain a less cluttered interface with more space for the contents.

Furthermore, ECGDT collects the currently available information for the ECG record and shows it in a popup window. When a record is loaded into the system, this tool shows record data such as the record ID, the sampling frequency, or the channels available (Fig. 8).

Fig. 8 Record statistics for I17

The last option, About, shows a popup window with information about the latest version of ECGDT.

4 Results

In this Section, results for the two evaluation stages are presented.

4.1 Diagnosis tests

To test the diagnostic performance of the tool, 70% of the average beats from the 43 selected records were used for training and the remaining 30% for validation.
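A 70/30 partition of this kind can be sketched as a shuffled split. The function name `split_beats` and the fixed seed are assumptions, and the source does not state how the beats were assigned to each subset, so a plain random split is shown.

```python
import random


def split_beats(beats, train_fraction=0.7, seed=0):
    """Randomly split a collection into training and validation subsets."""
    shuffled = list(beats)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Stand-in identifiers for 100 average beats.
train, validation = split_beats(range(100))
```

Every beat ends up in exactly one of the two subsets, so the validation scores are computed on beats that were not used to fit the models.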

In particular, bradycardia was correctly detected in 9 out of 9 cases (100.00%, mean AUC: 0.93), ischemia in 5 out of 7 cases (71.43%, mean AUC: 0.82), myocardial infarction in 9 out of 11 cases (81.82%, mean AUC: 0.72), tachycardia in 13 out of 13 cases (100.00%, mean AUC: 0.95), ventricular hypertrophy in 5 out of 8 cases (62.50%, mean AUC: 0.69), and Wolff-Parkinson-White syndrome in 1 out of 3 cases (33.33%, mean AUC: 0.77).
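The detection percentages follow directly from the case counts, as the short computation below shows. Only the counts reported above are used; the dictionary and function names are illustrative.

```python
# (correctly detected cases, total cases) for each condition, as reported above.
CASES = {
    "bradycardia": (9, 9),
    "ischemia": (5, 7),
    "myocardial infarction": (9, 11),
    "tachycardia": (13, 13),
    "ventricular hypertrophy": (5, 8),
    "Wolff-Parkinson-White syndrome": (1, 3),
}


def detection_rate(detected, total):
    """Percentage of affected cases that were correctly detected."""
    return 100.0 * detected / total


rates = {
    name: round(detection_rate(detected, total), 2)
    for name, (detected, total) in CASES.items()
}
```

With so few affected cases per condition, a single additional detection changes the rate substantially: for Wolff-Parkinson-White syndrome, 2 out of 3 would already give 66.67%.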

4.2 Usability tests

Users’ responses to UEQ were collected and processed. The values obtained for the six UEQ dimensions are presented in Fig. 9.

Fig. 9 Results for the six dimensions of UEQ after evaluation of ECGDT usability

In UEQ tests, a GUI receives a positive evaluation if all dimensions achieve values from 0.8 to 3.0. Values between −0.8 and 0.8 reveal neutral feelings, and values below −0.8 indicate a poorly designed GUI. ECGDT achieved values between 2.0 and 3.0 in all dimensions, with users emphasizing its attractiveness, efficiency, and dependability.

Users found ECGDT easy to use, with a pleasing interface and all the needed functionalities available and easy to access. They were also asked about their experience and about possible improvements that could be included in the system, and they pointed out some interesting details. One of these remarks was a possible modification of the color scheme, to help users with visual disabilities (for example, color blindness). Some users also suggested using the XML format for ECGDT files instead of the current ASCII format, arguing that it is a more open and widespread file standard.

This user feedback will be taken into consideration in future versions of ECGDT, to improve the tool and adapt it to the needs of its potential users.
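The interpretation bands quoted above can be written as a small helper. The function name `interpret_ueq` is an assumption, as is the treatment of the boundary values ±0.8 as neutral, since the text does not say to which band they belong.

```python
def interpret_ueq(value):
    """Map a UEQ dimension mean (scale -3.0 to 3.0) to an evaluation band."""
    if not -3.0 <= value <= 3.0:
        raise ValueError("UEQ dimension values lie between -3.0 and 3.0")
    if value > 0.8:
        return "positive"
    if value < -0.8:
        return "negative"
    return "neutral"  # assumption: exactly +/-0.8 is treated as neutral


labels = [interpret_ueq(v) for v in (2.0, 0.0, -1.5)]
```

Any dimension mean between 2.0 and 3.0, as obtained for ECGDT, therefore falls well inside the positive band.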

5 Discussion

CAD systems have proven themselves to be valuable tools [3, 5, 8, 15, 33, 36]. Nevertheless, their use is not as widespread as could be expected. Due to the complex interfaces and inconspicuous results of the first systems of this type, some users are still reluctant to trust a computer in health-related tasks [49]. Another possible drawback is the role of personal criteria in diagnosis, which are sometimes not easy to incorporate into the rules applied by the system.

In the present work, some of the conditions obtained lower values than expected. This is explained by the small number of affected records. For example, the detection rate for Wolff-Parkinson-White syndrome (detected in 1 of the 3 possible patients) would double if just one more case were detected. In the future, we intend to expand the study to a greater number of cases, to obtain more representative values.

The system presented in this paper can deal with six common heart abnormalities. However, many other cardiac diseases exist. Even though ECGDT cannot currently cope with other medical conditions, such as sleep apnea or heart valve complications, other diagnoses could be incorporated into the system. To do this, the corresponding probability scores could be calculated from similar nonlinear classifiers, fitted using specific diagnostic rules and particular parameters for each disease. However, researchers should be very rigorous when establishing normality parameters for each medical condition, since providing normality values to define the diagnostic rules can be very complicated. As an example, in the case of sleep apnea, various investigations employing different methodologies yield very different and non-comparable results for moderate-to-severe apnea detection [16, 34, 51], so settling on normality values can be a daunting task. In addition, most of the works that study the presence of sleep apnea through the analysis of ECG records use the RR intervals, trying to detect certain trends related to increases in their duration. Given that ECGDT already uses RR interval durations (through the variable RR.Duration), this could be a good starting point for adapting new diagnostic rules and creating a new GAM for sleep apnea diagnosis.

6 Conclusion

With the limitations of this type of system in mind, ECGDT has been devised as an easy-to-use tool that analyzes an ECG record, searching for the presence of six main conditions at a time. A simple yet useful GUI was also created, to simplify access to its functionalities for less technical users. Access to some related tasks was added to the system, such as beat detection, signal representation, content export, and wave delineation. Diagnosis results show promising capabilities in the detection of bradycardia, ischemia, myocardial infarction, tachycardia, ventricular hypertrophy, and Wolff-Parkinson-White syndrome. The interface has also received highly positive feedback from the users who tested it.

Future work will address improvements in the detection results for the current conditions, as well as interface refinements. The addition of new functionalities and diagnostic capabilities for new conditions will also be considered.