Background

Although conventional resting electrocardiography (ECG) has an important role in managing acute coronary syndromes and suggestive but non-diagnostic acute chest pain, it has well-recognized limitations in the detection of heart disease[1]. For both isolated and pooled ECG abnormalities, the sensitivity of conventional resting ECG as a predictor for coronary artery disease (CAD) and left ventricular hypertrophy (LVH) has been too low for it to be practical as a screening tool[2, 3]. Although conventional resting ECG, when normal, has excellent negative predictive value (NPV) for left ventricular systolic dysfunction (LVSD), the simultaneously poor positive predictive value (PPV) of abnormal ECG findings also limits conventional ECG's utility in heart failure screening[4, 5]. Thus, any improvements to the resting ECG that might notably increase its sensitivity for identifying CAD and LVH (without compromising related specificity) and/or its PPV in screening for LVSD (without compromising related NPV) would be clinically relevant.

Over the past two decades, several advanced techniques implemented in software have improved the diagnostic and/or predictive value of resting ECG. These techniques have included derived "3-dimensional" (spatial/spatiotemporal) ECG;[6-9] high-frequency (HF) QRS ECG;[10] detailed studies of waveform complexity by singular value decomposition (SVD);[8, 11-13] and beat-to-beat QT variability (QTV)[14-17] and R-wave to R-wave variability (RRV)[18, 19]. A theoretical advantage of computerized ECG systems is that they allow for multiple conventional and advanced ECG techniques to be performed in software during a single digital recording. Related results can then be integrated automatically by using statistical pattern recognition techniques [20] to maximize diagnostic or predictive accuracy. In practice, these procedures can also be performed rapidly and relatively inexpensively.

The hypotheses of the present study were that a ~5-min resting 12-lead advanced ECG test ("A-ECG"), defined as the multivariate logistical integration of key results from both the conventional and advanced ECG, could detect common cardiac conditions such as CAD and concentric LVH with greater sensitivity and accuracy than optimized pooled criteria from the strictly conventional ECG and also predict LVSD with greater PPV and accuracy.

Methods

Participants

Data from all individuals who volunteered for resting ~5-min high-fidelity ECG studies from 2001 through mid-2007 (training set) or thereafter (test set) were considered for inclusion. These included data from: 1) Cardiac clinic patients who volunteered for individual studies at any of the following clinical sites: Texas Heart Institute (Houston, TX); the University of Texas Medical Branch (Galveston, TX); the University of Texas Health Sciences Center (San Antonio, TX); Brooke Army Medical Center (San Antonio, TX); St. Francis Hospital (Charleston, WV); the Universidad de los Andes (Mérida, Venezuela); and Lund University Hospital (Lund, Sweden); and 2) Asymptomatic individuals who volunteered as "controls" at any of the following sites: Johnson Space Center (Houston, TX); the Universidad de los Andes and Lund University Hospital. For the test set, additional data from patients whose ~5-min ECGs had been collected at the Charleston Area Medical Center as part of earlier studies but that became available to us during 2007 (i.e., the STAFF III database)[7] were also utilized. All participants gave original informed consent, and the Institutional Review Boards of one or more of the institutions approved the studies.

Inclusion criteria

For both the training and the test sets, to define our "Disease" groups, we included data only from those cardiac clinic patients whose disease (CAD, LVH and/or LVSD) was proven based on ECG-independent information derived from standard clinical imaging tests [16, 21-23] performed within one month of ECG testing by investigators or other clinicians blinded to the automatically-produced A-ECG results. Disease was defined as the presence of at least one of the following: 1) CAD, defined as a coronary angiogram showing at least one obstruction ≥50% in at least one major native coronary vessel or coronary graft, or, if for clinical reasons angiography was not performed, then one or more reversible perfusion defects on 99mTc-tetrofosmin single-photon emission computed tomography (SPECT);[16, 21, 23] 2) LVH, defined as moderate or greater concentric hypertrophy or concentric remodeling according to the guidelines of the American Society of Echocardiography;[24] and/or 3) LVSD of any etiology, defined as LVEF <50% by echocardiography, cardiac magnetic resonance imaging (CMR) or SPECT. Diseased individuals who met none of these three inclusion criteria but who had isolated right ventricular pathology, isolated LV diastolic dysfunction, isolated LV cavity enlargement or isolated fixed defect on SPECT were excluded from the study.

To derive correspondingly definitive "Healthy" groups for both the training and test sets, we included data only from low-risk asymptomatic controls who had no evidence of cardiovascular or other systemic disease based on a negative history and physical examination. Asymptomatic controls who were hypertensive (BP≥140/90), receiving treatment for hypertension, diabetic or active smokers were excluded. All cardiac clinic patients or asymptomatic individuals who had complete bundle branch block, sinus tachycardia, non-sinus rhythm, paced rhythm, pre-excitation, or an incomplete ECG recording were also excluded from both the training and test sets.

Training set

Of the 952 individuals who were considered for the training set, 708 met the above inclusion criteria, including 290 for the Disease group training set and 418 for the Healthy group training set. Of the 290 patients constituting the Disease group training set, 188 had normal LV function (136 had CAD; 25 had LVH; and 27 had both CAD and LVH) and constituted a "Disease without LVSD" training subset, whereas another 102 had LVEF <50% (77 with ischemic cardiomyopathy; 25 with nonischemic dilated cardiomyopathy) and constituted a "Disease with LVSD" training subset. Of the 418 controls in the Healthy group training set, a majority also had their disease-free status further demonstrated through normal or unremarkable results on a conventional or SPECT exercise stress test, echocardiogram, and/or CMR test performed for research purposes within 2 years of their ~5-min ECG. These included 55 elite, endurance-trained normotensive Swedish athletes (38 males) who had had clinically unremarkable CMR results.

Test set

Data for the test set were obtained from an additional 315 individuals, including from an additional 208 diseased patients and an additional 107 healthy controls. The 208 individuals in the Disease group test set consisted of 139 patients with CAD, 17 with concentric LVH, 11 with both CAD and LVH, and 41 with LVSD (27 with ischemic and 14 with nonischemic dilated cardiomyopathy). The Healthy group test set consisted of 107 consecutive individuals over age 35 (including 9 elite athletes) who met the Healthy group inclusion criteria, recruited after mid-2007 mainly at NASA's Human Test Subject Facility in Houston. Within the Disease Group test set, data for 97 of the 208 patients came from the pre-procedural portion of the STAFF III database[7]. Since all patients in the STAFF III database had catheterization-proven CAD but unreported LV function, their data, as well as data from another 26 diseased patients with unknown LV function were by necessity withheld from the LVSD-related sub-analyses in the test set.

ECG data collection and analyses

At all sites, a high-fidelity (1000 samples/sec) computerized 12-lead ECG system (Siemens-Elema AB, Solna, Sweden or CardioSoft, Houston, TX) was used to acquire at least 256 waveforms acceptable for signal averaging and variability analyses.

A. Conventional ECG parameters and criteria

Signals from the first 10 sec of the conventional ECG recording were analyzed automatically in software to quantify all major intervals, axes, and voltages as well as ST segment levels. Initial candidate criteria used for defining these strictly conventional 12-lead ECGs as "abnormal" were: 1) LVH according to traditional Sokolow-Lyon voltage criteria (SV1 + RV5 or RV6 ≥3.5 mV) or to gender-specific Cornell voltage (RaVL + SV3 ≥2.8 mV in men or ≥2.0 mV in women) or Cornell product (244 mV*ms with a 0.8 mV adjustment for women) criteria;[25] 2) old infarction according to Anderson et al's subset of Selvester's criteria;[26] 3) resting ST depressions or T-wave abnormalities according to computerized Minnesota Codes 4.1 to 4.4 and 5.1 to 5.3; 4) prolonged QTc (≥450 ms in men and ≥460 ms in women) or QRS (>110 ms) interval (individuals with complete bundle branch blocks being excluded from the study); or 5) left axis deviation (≤-30°).
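As an illustration only, the simple threshold criteria listed above (Sokolow-Lyon voltage, Cornell voltage, QTc, QRS duration and axis) can be sketched as follows. The `RestingEcg` container, its field names and the `flagged_criteria` function are hypothetical and are not the software used in the study. The sketch assumes that S-wave amplitudes are supplied as positive magnitudes and that the larger of RV5 and RV6 is used for the Sokolow-Lyon sum. The Cornell product, Selvester and Minnesota Code criteria are omitted.

```python
from dataclasses import dataclass


@dataclass
class RestingEcg:
    """Hypothetical container for automated measurements (names are illustrative)."""
    s_v1_mv: float       # S-wave depth in V1, given as a positive magnitude (assumption)
    r_v5_mv: float
    r_v6_mv: float
    r_avl_mv: float
    s_v3_mv: float       # S-wave depth in V3, given as a positive magnitude (assumption)
    qtc_ms: float
    qrs_ms: float
    qrs_axis_deg: float
    is_male: bool


def flagged_criteria(ecg: RestingEcg) -> list:
    """Return the names of the simple threshold criteria that the recording meets."""
    flags = []
    # Sokolow-Lyon voltage: SV1 + RV5 or RV6 >= 3.5 mV (larger of RV5/RV6 assumed).
    if ecg.s_v1_mv + max(ecg.r_v5_mv, ecg.r_v6_mv) >= 3.5:
        flags.append("sokolow_lyon_voltage")
    # Cornell voltage: RaVL + SV3 >= 2.8 mV in men or >= 2.0 mV in women.
    if ecg.r_avl_mv + ecg.s_v3_mv >= (2.8 if ecg.is_male else 2.0):
        flags.append("cornell_voltage")
    # Prolonged QTc: >= 450 ms in men, >= 460 ms in women.
    if ecg.qtc_ms >= (450 if ecg.is_male else 460):
        flags.append("prolonged_qtc")
    # Prolonged QRS: > 110 ms.
    if ecg.qrs_ms > 110:
        flags.append("prolonged_qrs")
    # Left axis deviation: <= -30 degrees.
    if ecg.qrs_axis_deg <= -30:
        flags.append("left_axis_deviation")
    return flags
```

Under pooled criteria of this kind, a recording would be called "abnormal" if the returned list is non-empty; the measurement values a caller passes in are entirely up to the upstream analysis software.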

B. Advanced ECG parameters obtained after signal averaging

Signal averaging was performed over the entire ~5-min (256-beat) recording using software developed by the authors[10, 13] to generate results for parameters of: 1) 12-lead HF QRS ECG;[10] 2) derived 3-dimensional ECG, using the regression-related Frank-lead reconstruction technique of Kors et al[27] to generate several vectorcardiographic parameters, including for example the spatial mean QRS-T angle,[6, 8, 28] the spatial maximum ("peaks") QRS-T angle[9] and the magnitude,[28] direction[28] and beat-to-beat variation[29] of the spatial ventricular gradient and its components; and 3) QRS and T-waveform complexity via SVD, to derive for example the principal component analysis (PCA) ratio,[11, 13, 30] the relative residuum[12, 13] and the dipolar and nondipolar voltage equivalents[8] of the QRS and T waveforms. The majority of these parameters and their related detailed methods have been described in other recent publications[10, 13, 31]. We also generated results for several other potentially promising parameters (see Additional file 1: Supplemental Table 1 for partial list), including, for example, the spatial ventricular activation time [32] and the total integral of the Z-lead QRS complex above 5 Hz ("Z integral")[33].
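To make one of these parameters concrete, the following minimal sketch computes a spatial mean QRS-T angle as the angle between the mean QRS vector and the mean T vector in three dimensions. It assumes the inputs are already signal-averaged samples of the derived orthogonal X, Y and Z leads within the QRS and T windows; the Frank-lead reconstruction and window detection are not reproduced, and the function names are illustrative rather than taken from the authors' software.

```python
import math


def mean_vector(samples):
    """Average a sequence of (x, y, z) samples into a single 3-D vector."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def spatial_angle_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("angle is undefined for a zero-length vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def spatial_mean_qrs_t_angle(qrs_xyz, t_xyz):
    """Angle between the mean QRS vector and the mean T vector (degrees)."""
    return spatial_angle_deg(mean_vector(qrs_xyz), mean_vector(t_xyz))
```

An angle near 0° means the mean depolarization and repolarization vectors point in similar directions, whereas an angle near 180° means they are opposed.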

C. Advanced parameters derived from variability analyses

Several parameters of 256-beat RRV and QTV described in previous publications[17, 31, 34] were again evaluated via custom software programs[17]. These included the QT variability index (QTVI), but using the means and variances of the RR interval[15] rather than those of the heart rate[14] in the denominator of the QTVI equation, and the "unexplained" part of QTV[31, 34].
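For readers unfamiliar with the index, the following sketch shows an RR-interval-based QTVI. It assumes the commonly used definition of QTVI as the base-10 logarithm of the ratio of normalized QT variance to normalized RR variance, where each variance is divided by the square of the corresponding mean. The function name and the per-beat input lists are hypothetical, and the custom software referenced above may differ in detail (for example in beat selection or detrending).

```python
import math
from statistics import mean, variance


def qtvi_rr(qt_ms, rr_ms):
    """QT variability index with RR-interval terms in the denominator.

    Assumed form: QTVI = log10[(QTv / QTm**2) / (RRv / RRm**2)],
    where v is the variance and m is the mean of the per-beat series.
    """
    if len(qt_ms) != len(rr_ms) or len(qt_ms) < 2:
        raise ValueError("need two equal-length series of at least 2 beats")
    qt_norm_var = variance(qt_ms) / mean(qt_ms) ** 2
    rr_norm_var = variance(rr_ms) / mean(rr_ms) ** 2
    if qt_norm_var <= 0 or rr_norm_var <= 0:
        raise ValueError("both series must show non-zero variability")
    return math.log10(qt_norm_var / rr_norm_var)
```

Because both series have the same length, using the sample variance or the population variance gives the same index: the correction factor cancels in the ratio.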

Statistical Analyses (including generation, validation and testing of A-ECG scores)

Using the training set, promising candidate subsets of ECG parameters for potential inclusion in primary ("Healthy versus Disease") and secondary ("Disease with versus without LVSD") A-ECG scores were first identified using a branch-and-bound feature selection procedure [20] implemented in SAS 9.1.3 (Cary, NC). To avoid the so-called "curse of dimensionality", the number of ECG parameters incorporable into any potential A-ECG score was limited to fewer than one-tenth of the minimum number of training samples available in a given group or subgroup[20]. Logistic regression was used to retrospectively estimate the probability of any subject in the training set being a member of the Disease group, and of any diseased subject in the training set being a member of the "Disease with LVSD" subgroup, based strictly on his/her A-ECG-based independent variables and a cutoff predicted probability of >0.5. The best candidate subsets of parameters (A-ECG scores) were then further validated by bootstrap analysis[35] in which for each fixed score, the data were iteratively resampled 1000 times and the logistic regression coefficients for each parameter in the given score re-estimated. The bootstrap analyses, implemented in Stata 10.0 (College Station, TX), revealed not only the variability in the coefficients, but also those candidate A-ECG scores that should be discarded because of their doubtful utility for classifying later subjects in the test set, for example scores with coefficients that varied greatly or that did not have the expected sign over all 1000 bootstraps. Prior to subsequent evaluation in the test set, the bootstrap-validated A-ECG scores were further evaluated within the training set via a jackknife procedure[35] in which the score's sensitivity, specificity, accuracy or predictive values were assessed by using the data for all but one observation in the training set to classify the omitted observation, then repeating the process for each observation in turn.
Comparisons of accuracies (sensitivities and specificities) and predictive values between strictly conventional and A-ECG classifiers were performed using Cochran's Q[36] and Wald tests, respectively, the latter employing the difference-based weighted least squares method[37]. For simple illustrative comparisons between groups, the Wilcoxon rank sum and receiver operating characteristic curve statistics were used.
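Three of the steps described above can be sketched in a few lines: the one-tenth cap on the number of parameters, the conversion of a logistic score into a class label at a cutoff probability of >0.5, and the leave-one-out (jackknife) assessment. This is a minimal sketch, not the SAS or Stata code used in the study. All function names are hypothetical, and the `fit` and `predict` callables stand in for whatever model-fitting routine is used.

```python
import math


def max_parameters(group_sizes):
    """Largest whole number strictly below one-tenth of the smallest group."""
    return (min(group_sizes) - 1) // 10


def logistic_probability(intercept, coefficients, values):
    """Predicted probability from a fitted logistic score."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    # Numerically stable form of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def classify(probability, cutoff=0.5):
    """Assign the positive class only when the probability exceeds the cutoff."""
    return probability > cutoff


def jackknife_accuracy(features, labels, fit, predict):
    """Leave-one-out accuracy: refit without each case, then classify that case.

    `fit(train_x, train_y)` returns a model and `predict(model, x)` returns a
    label; both are supplied by the caller (hypothetical interface).
    """
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

With the training group sizes reported in the next section, `max_parameters([290, 418])` gives 28 for primary scores and `max_parameters([188, 102])` gives 10 for secondary scores. Sensitivity, specificity and predictive values can be jackknifed in the same way by tallying the four outcome counts instead of a single accuracy.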

Results

Table 1 shows baseline characteristics of the Disease and Healthy groups and of the two Disease subgroups in the training set. Besides being free of hypertension and diabetes and having lower body mass index and medication use, the Healthy group training set also was younger than the Disease group training set. Because of the age disparity, two sets of primary A-ECG scores were constructed and validated in the training set for later evaluation in the test set: one wherein all healthy subjects were included and one wherein only those healthy subjects >40 years of age (mean 51 ± 8 years, N = 133, 63% men) were included. For the additional 315 individuals who comprised the test set, the distributions of hypertension, diabetes, body mass index, LVEF, and medication use were similar to those shown in Table 1. The mean ages in the test set were 59 ± 12, 59 ± 13, 56 ± 12 and 49 ± 11 years for the Disease group, the with-LVSD and without-LVSD Disease subgroups and the Healthy group respectively, men comprising 65%, 74%, 60% and 57% of those groups, respectively.

Table 1 Demographic Characteristics of the Training Set Disease Group, Disease Subgroups and Healthy Group

Figures 1 and 2 show how the performance of an A-ECG score in the training set depended on the number of ECG parameters the score incorporated. For primary ("Healthy versus Disease") A-ECG scores (Figure 1, N = 708), only negligible further gains in cross-validated accuracy occurred with scores containing more than ~9 parameters. For secondary ("Disease with versus without LVSD") A-ECG scores (Figure 2, N = 290), this same cutoff occurred at only ~5 parameters. The first and second parameters incorporated into primary A-ECG scores by the automatic selection procedures were the QTVI in lead II and the spatial mean QRS-T angle, respectively (Figure 1). For secondary (LVSD) A-ECG scores, the first and second parameters incorporated by the same procedures were the Z integral and the spatial mean QRS-T angle, respectively (Figure 2).

Figure 1 Effect of number of parameters in a primary ("Healthy versus Disease") Advanced ECG (A-ECG) score on the score's jackknifed accuracy in the training set (N = 708).

Figure 2 Effect of the number of parameters in a secondary ("Disease with versus without left ventricular systolic dysfunction", LVSD) Advanced ECG (A-ECG) score on the score's jackknifed accuracy in the training set (N = 290).

Table 2 shows the performances in the training set of the pooled, strictly conventional ECG criteria, along with those of the most relevant single parameters and A-ECG scores. The candidate conventional ECG criteria outlined in the Methods section were retrospectively optimized when their Sokolow-Lyon subcriteria were dropped and replaced instead by subcriteria for left atrial abnormality (P-wave duration >120 ms or terminal negative component of a biphasic P-wave in lead V1 >4 ms*mV in area). Thus, only the resulting optimized set of conventional ECG criteria was carried forward for later use with the test set. Not unexpectedly, the retrospectively optimized A-ECG scores outperformed the retrospectively optimized pooled conventional ECG criteria in the training set. Of note, the optimal primary A-ECG scores made use of the entire ~5-min (so called "full-disclosure") 12-lead recording because they incorporated results from QTVI (Figure 1). Inasmuch as most 12-lead ECG machines do not yet have full-disclosure capabilities, Table 2 also shows the diagnostic performance in the training set of an optimized primary A-ECG score that was only allowed to incorporate results from parameters likely yielding reliable and reproducible results within strictly "snapshot" (10-sec) ECG recordings.

Table 2 Accuracies and Predictive Values of Pooled Conventional versus A-ECG Criteria in the Training Set

Table 3 shows the performances in the test set of the optimized pooled conventional ECG criteria and of the most relevant single parameters and A-ECG scores generated in the training set. Although as expected most A-ECG scores tended to have slightly diminished performance in the test set compared to the training set (compare Table 3 to Table 2), several primary A-ECG scores generated from the training set still had accuracies of 90% or greater in the test set. For example, compared to the optimized pooled criteria from the strictly conventional ECG, the best 7-parameter primary full-disclosure A-ECG score generated in the training set increased the sensitivity of resting ECG for identifying Disease in the test set from 78% (72-84%) to 92% (88-96%) (P < 0.0001) while also increasing specificity from 85% (77-91%) to 94% (88-98%) (P < 0.05). Another 7-parameter A-ECG score that only incorporated parameters likely yielding reliable and reproducible results within "snapshot" ECG recordings was only slightly less accurate. In diseased patients, another 5-parameter secondary A-ECG score generated in the training set also increased the PPV of ECG for additionally predicting LVSD in the test set from 53% (41-65%) to 92% (78-98%) (P < 0.0001) without significantly compromising NPV. This secondary A-ECG score had corresponding positive and negative likelihood ratios for LVSD in the test set of 12.16 and 0.18, respectively, versus 1.23 and 0.21 for the optimized pooled conventional ECG criteria. The exact components and coefficients of those training set-generated primary and secondary A-ECG scores that performed best in the test set are shown in Additional file 2 (Supplemental Table 2).
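The accuracy measures, predictive values and likelihood ratios quoted above all follow from the four cells of a 2 × 2 table of test result against reference diagnosis. The sketch below shows these standard definitions; the function name is illustrative, and any counts passed to it are the caller's own, since the underlying counts are not given in this section.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2 x 2 diagnostic measures from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        # Positive likelihood ratio: sensitivity / (1 - specificity).
        "lr_positive": (sensitivity / false_positive_rate
                        if false_positive_rate > 0 else float("inf")),
        # Negative likelihood ratio: (1 - sensitivity) / specificity.
        "lr_negative": ((1 - sensitivity) / specificity
                        if specificity > 0 else float("inf")),
    }
```

For example, with purely hypothetical counts of 40 true positives, 5 false positives, 10 false negatives and 95 true negatives, the sensitivity is 0.80, the specificity is 0.95, the PPV is about 0.89 and the positive likelihood ratio is about 16.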

Table 3 Accuracies and Predictive Values of Pooled Conventional versus A-ECG Criteria in the Test Set

Discussion

The results of this study suggest that resting 12-lead A-ECG tests can detect the presence of catheterization-proven or other imaging-proven CAD and LVH with higher sensitivity and specificity than optimized pooled criteria from the strictly conventional ECG. This improved detection is accomplishable via the use of optimal combinations of 7 or fewer advanced and conventional ECG parameters within computerized multivariate A-ECG scores. Similar A-ECG scores can also increase the PPV of resting ECG for predicting LVSD without compromising related NPV.

Beginning in the 1960s, Pipberger et al applied a multivariate approach to the conventional ECG (orthogonal ± 12-lead) to obtain excellent diagnostic accuracies, albeit generally only for those conditions considered classically diagnosable by ECG, such as ventricular hypertrophy and previous infarction[38, 39]. Our results therefore confirm Pipberger et al's suggestion that the diagnostic utility of resting ECG could be continuously improved through computer-automated multivariate analyses validated against ECG-independent diagnostic information. Our results also suggest that the use of 21st-century software technology can now extend the reach of resting ECG toward the detection of conditions previously thought not to be detectable by it, for example CAD without prior infarction. The basic premise of A-ECG is that a ~5-min resting 12-lead recording contains sufficient information, if assiduously sought, to allow gross detection of most cardiac pathology. Although the ECG equipment used in this study was "high fidelity," the best-performing A-ECG scores likely did not require such equipment and thus should also be derivable from many "standard-fidelity" ECG devices.

The present results suggest that with further validation, resting A-ECG might join other methods that are presently recommended [40, 41] or suggested [42] as initial tests for individuals at intermediate pretest epidemiologic risk for CAD. The main advantages of A-ECG are that it can be performed rapidly and inexpensively, including in patients who are unable to exercise, and it does not expose patients to the potentially health-compromising effects of radiation. The convenience of A-ECG is also high in that the majority of individuals who are clinically discerned as being at intermediate pretest epidemiologic risk for CAD will likely have resting 12-lead ECGs anyway. Finally, A-ECG also has a multifunctional aspect in that it can potentially aid screening not only for CAD but also for LVH and LVSD.

Heart failure is an increasing and expensive problem worldwide. Because the adequate and timely treatment of LVSD can reduce mortality and frequency of hospitalizations,[43, 44] it would be beneficial if a simple resting ECG could serve as a reasonably accurate initial screening test. Although our results corroborate the findings of others that a normal resting conventional 12-lead ECG has a very high NPV for LVSD (typically >95% in individuals with suspected heart failure in the general population),[4, 5] conventional ECG's predictive value for LVSD is nonetheless limited by its simultaneously poor PPV (typically ≤35% in the same studies)[4, 5]. Notably, in our study, the best secondary A-ECG scores nearly doubled the PPV of resting ECG for LVSD without compromising NPV. Given that the use of A-ECG therefore mitigates resting ECG's principal weakness in LVSD screening (poor PPV) and the fact that even conventional ECG alone sometimes outperforms other proposed modalities for LVSD screening such as B-type natriuretic peptide,[5] A-ECG might serve as a useful adjunct to conventional ECG and natriuretic peptides in heart failure screening, particularly for better guiding referrals to more definitive but costly echocardiography tests.

Limitations

We did not nominally allow age, a continuous parameter that correlates with many ECG changes,[45] to be incorporated into A-ECG scores. We took this approach not only because the incorporation of age might ultimately compromise the ability of A-ECG to detect disease in younger individuals, but also because our principal aim was to compare the performance of A-ECG to that of optimized, strictly conventional ECG criteria that likewise do not incorporate age. While we are able to construct primary A-ECG scores that incorporate age and that, compared to the primary A-ECG scores described herein, have non-significantly increased accuracy in the full training set (where age differences between Healthy and Disease groups are greatest), these same age-incorporating scores also have non-significantly decreased accuracies in the arguably more important test set. Similarly, while we are also able to construct A-ECG scores on a gender-specific basis, doing so does not statistically significantly improve performance for either gender in either the training set or the test set. This finding may relate to the fact that the best performing non-gender specific scores all contained at least one parameter known to have higher values in men, for example spatial QRS-T angle,[46] balanced by at least one parameter known to have higher values in women, for example a measure of T-wave complexity[47]. Clearly, however, the use of gender-specific A-ECG scores validated in larger data sets might further optimize performance in the future.

Because our hypothesis involved assessing the relative performance of A-ECG versus conventional ECG and we were not able to assess coronary microvascular function,[48, 49] we excluded from our "Healthy" groups asymptomatic diabetics, hypertensives and smokers as well as all individuals with angina or subclinical CAD (luminal stenoses <50% by catheterization). From the perspective of assessing absolute performance this might of course be construed as a limitation. To address this issue further, we have proceeded to analyze the ECG data from these excluded higher risk individuals (N = 136; 55 ± 11 years, 51% females), the results revealing that just over one half (69/136) would have had positive optimized conventional ECG criteria for "Disease" and just over one-third (49/136; 51/136) positive full-disclosure and snapshot primary A-ECG scores, respectively. Thus, had we simply ignored any possible effect of subclinical CAD on the ECG (in spite of evidence to the contrary [50]) and just assigned these higher risk individuals to our "Healthy" group, not only would the specificity of all ECG testing have been decreased, but the specificity of the primary A-ECG scores would also have been further increased relative to that of the optimized, strictly conventional ECG criteria. Additional studies, ideally using direct physiological assessment of coronary arteries [49] as the gold standard, are therefore required to determine whether any clinical importance should be attached to the modestly lower prevalence of A-ECG compared to conventional ECG abnormalities in these higher risk individuals.

Nearly all our patients with LVSD had experienced symptoms and had begun medical therapy prior to their ~5-min ECG tests. Therefore, although we demonstrated that A-ECG scores have better predictive value for medically-managed LVSD than do pooled criteria from the strictly conventional ECG, the ability of A-ECG to better predict pre-symptomatic LVSD was not directly tested and requires further study. Additional study limitations include the grouping together of CAD and LVH (keeping in mind that these conditions are commonly co-morbid plus the more important fact that a high suspicion of either during initial screening would prompt further characterization through imaging); the relatively small number of patients with isolated LVH in the test set; the absence of a larger prospectively studied test group without prior known Disease; and the use of multiple different imaging modalities. Finally, we have not studied the prognostic utility of A-ECG scores. Additional studies are therefore required to determine whether A-ECG scores can further augment the known prognostic utility of certain of their key constituent parameters [6, 8, 9, 11, 12, 15].

Conclusions

Resting 12-lead ECG tests that combine 7 or fewer advanced and conventional ECG parameters within computerized A-ECG scores are more accurate than optimized pooled criteria from the strictly conventional ECG in detecting obstructive CAD and concentric LVH and in screening for LVSD in individuals with known cardiac disease.