Background

Chest pain remains one of the most common complaints among patients presenting to emergency departments (ED), with >100,000 patients hospitalized each year in Australia with acute coronary syndromes (ACS) [1]. The associated morbidity, mortality and economic costs constitute a significant burden on the Australian health system [2]. In these patients, the use of risk stratification tools to predict the risk of death, cardiac complications and the pre-test probability of ACS has been demonstrated to aid clinicians in appropriately prioritizing patients for investigations [3, 4] and to assist in identifying those at higher risk who might benefit most from potent drug therapies or an early invasive therapeutic approach [5-10]. Risk stratification tools have further been shown to allow patients to be better informed of their prognosis [7], to improve cost-effectiveness while minimizing unnecessary treatment complications [11] and to reduce unnecessary admissions to inpatient monitored beds without increasing complications, thereby potentially having a positive impact on access block [3, 6, 12, 13].

Recently, in Queensland, a suite of clinical pathways for the management of patients presenting to public hospitals with chest pain was developed [14], with application incentivized by practice improvement payments [15]. These pathways rely on risk stratification utilizing the National Heart Foundation of Australia risk stratification tool (HFA), perhaps the most prominent risk stratification tool utilized in Australia [16]. Although the individual components of this tool are evidence based and the tool was developed by consensus of an expert panel, it was designed for risk stratification of patients with ACS rather than for the undifferentiated ED chest pain population. There are conflicting data regarding its performance in ED chest pain populations [17, 18].

The aim of this prospective observational study was to compare the performance of three methods of risk stratification, namely the HFA tool [16], the Goldman score [6] and the Thrombolysis in Myocardial Infarction (TIMI) risk score [19], for prediction of a composite outcome of major adverse cardiac events (MACE) within 30 days of ED attendance.

Methods

This was a prospective, cohort study undertaken at a single tertiary referral ED. Patients presenting to the ED with non-traumatic chest pain during the preceding 48 h and aged >30 years were eligible for inclusion in the study. Exclusion criteria were the presence of a definitive non-ischemic cause for chest pain, isolated angina-equivalent symptoms, trauma-related chest pain, cardiac arrest on arrival to the ED, patients with ECG criteria for ST-elevation myocardial infarction (MI) on arrival to the ED and inability to provide informed consent.

Data collection occurred during the ED presentation on weekdays when a trained research nurse was available. Follow-up was undertaken both by review of medical records utilizing a standardized data collection tool and by a phone call to the patient (or a proxy if the patient was not contactable) employing a structured interview at 72 h and 30 days. A single emergency physician, blinded at the time to the patient outcomes, retrospectively undertook the risk stratification process using the prospectively collected data items and the initial ED-acquired electrocardiogram.

All patients included in the study had their cardiac risk determined by each of three methods of risk stratification utilizing findings on presentation to the ED (see Table 1). The HFA [16] and Goldman tools classify patients into risk groups with nominal descriptors (e.g., high, low), while the TIMI risk tool derives a score out of seven [9, 17].

Table 1 Risk stratification tools

The primary outcome of interest was MACE within 30 days of ED presentation. MACE components were defined utilizing the American College of Cardiology Clinical Data Standards definitions [20] and included acute myocardial infarction (prevalent and incident), recurrent ischemia requiring urgent revascularization, cardiogenic shock, ventricular arrhythmia requiring emergent intervention or high-grade atrioventricular block requiring treatment, cardiac arrest and all-cause mortality.

For analysis, continuous variables with non-normal distribution were expressed as medians and interquartile ranges; categorical data were presented as percentages. Group differences in continuous and categorical variables were compared with Kruskal-Wallis and chi-square tests, respectively. For each of the risk stratification methods, ROC curves were used to evaluate predictive performance. Area under the curve (AUC), with 95% confidence intervals (CI), was utilized as a summary measure of the diagnostic accuracy of the prediction tools across the gamut of risk groups [21]. For the comparison of clinical performance of the risk scores, we chose to include HFA high risk, Goldman high risk and two cutoffs of the TIMI score (≥1 and ≥2). The latter were chosen pragmatically a priori in an attempt to balance case discrimination and sensitivity. Inclusion of patients with a TIMI score of zero provides no case discrimination, while using a cutoff of ≥3 has been shown to have a sensitivity <60% [22].
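As an illustration of the AUC summary measure described above, the following minimal Python sketch computes the AUC for an ordinal risk score as the proportion of event/non-event patient pairs in which the patient with the event has the higher score, with ties counted as one half. The function name `auc_from_scores` and the toy data are hypothetical and are not study data; the study analysis itself was performed in Stata.

```python
def auc_from_scores(scores, outcomes):
    """Return the AUC of a risk score for a binary outcome.

    The AUC equals the probability that a randomly chosen patient with the
    outcome has a higher score than a randomly chosen patient without it;
    tied pairs contribute 0.5.
    """
    with_event = [s for s, y in zip(scores, outcomes) if y]
    without_event = [s for s, y in zip(scores, outcomes) if not y]
    if not with_event or not without_event:
        raise ValueError("need at least one patient with and one without the outcome")

    concordant = 0.0
    for s_event in with_event:
        for s_none in without_event:
            if s_event > s_none:
                concordant += 1.0
            elif s_event == s_none:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Hypothetical toy data: a 0-7 score and a 30-day outcome flag (1 = event).
toy_scores = [0, 1, 1, 2, 3, 4, 2, 5]
toy_events = [0, 0, 1, 0, 1, 1, 0, 0]

toy_auc = auc_from_scores(toy_scores, toy_events)
print(f"AUC = {toy_auc:.2f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients with and without the outcome. The pairwise formulation is used here only because it makes the interpretation explicit.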

For all comparisons, a p-value of <0.05 was considered statistically significant. Statistical analysis was performed with Stata, version 10 (College Station, TX, USA). Power calculations were generated as follows: the AUC of the TIMI score has previously been demonstrated to be 0.6 [23], while the Goldman score has been shown to have an AUC of 0.9 [3, 6]. The performance of the HFA score was expected to be similar to that of the Goldman score. Assuming a correlation between positive and negative groups of 0.4, a sample size of 400 patients was required to distinguish an AUC of 0.85 from one of 0.9 at a significance level of 0.05 with 80% power. An interim analysis was performed as the approved study duration and funding were nearing their end. The study was terminated early as this interim analysis revealed a clearly statistically significant result. The institutional ethics committee approved the study.

Results

Two hundred eighty-one patients were studied, with 276 completing 30-day follow-up (Figure 1). The median age of the study group was 56 years (IQR 47.5-66), and the majority (61.5%) were male. Patient characteristics are summarized in Table 2. Of the 276 patients with 30-day follow-up, 39 (14.1%) had a MACE.

Figure 1 Patient enrollment.

Table 2 Patient characteristics

The predictive performance of the risk stratification tools is shown in Figure 2. AUC for prediction of MACE was poorest for the HFA tool, with an AUC of 0.54 (95% CI 0.45-0.63). The TIMI risk score had the highest AUC of the three tools tested, with an AUC of 0.71 (95% CI 0.63-0.79), while the Goldman tool had an AUC of 0.67 (95% CI 0.57-0.77). The difference between the tools in predictive ability for MACE was highly significant (p = 0.0002). There was no statistically significant difference in performance between the Goldman tool and TIMI score. The sensitivity and specificity of the tools are summarized in Table 3.

Figure 2 Predictive performance of risk stratification tools.

Table 3 Comparative performance of risk stratification tools

Utilizing the chi-square test to compare the AUC for the TIMI, Goldman and HFA tools, the difference between the tools in terms of predictive ability for MACE was found to be highly significant (chi2, N = 276, p <0.0001). Further analysis showed that the HFA tool performance was different from that of each of the other tools, which were similar in performance.

Discussion

In this study of ED patients with chest pain for which no alternative non-ACS cause was apparent, none of the tools under investigation was ideal. An ideal tool for stratifying risk in ED patients with chest pain would have high sensitivity for MACE and sufficient specificity to enable feasible application and, from a patient perspective, to limit exposure to unnecessary investigations or interventions. It would also have data elements that were non-subjective or, at least, had high interobserver reliability. The TIMI risk score had the best performance in stratifying risk for MACE. However, utilizing a cutoff of ≥1 to achieve a sensitivity of 97% resulted in a specificity of only 13%. A higher cutoff of >3 would be required to achieve a more acceptable specificity of 60%; however, the resultant sensitivity of 74% is unacceptable. The sensitivity of the Goldman score (69%) was insufficient to be useful in the ED population, even with the lowest possible cutoff, namely including all patients with a risk higher than ‘very low’. The HFA risk stratification tool had an AUC of 0.54, with a 95% confidence interval (0.45-0.63) that includes 0.5, the value expected by chance. Its sensitivity for MACE, using a cutoff to include all high-risk patients, was 100%, but with a specificity of only 8%, its feasibility in ED clinical practice is limited.
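The trade-off between sensitivity and specificity as the cutoff of a score is raised can be made concrete with a short Python sketch. The function name `sens_spec` and the ten-patient data set are hypothetical and chosen only for illustration; they do not reproduce the study results.

```python
def sens_spec(scores, outcomes, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is treated as test positive."""
    tp = fn = tn = fp = 0
    for score, had_event in zip(scores, outcomes):
        test_positive = score >= cutoff
        if had_event and test_positive:
            tp += 1
        elif had_event:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without the outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: risk score and 30-day outcome flag (1 = event).
toy_scores = [0, 1, 1, 2, 3, 4, 2, 5, 0, 1]
toy_events = [0, 0, 1, 0, 1, 1, 0, 0, 0, 0]

for cutoff in (1, 2, 3):
    sens, spec = sens_spec(toy_scores, toy_events, cutoff)
    print(f"cutoff >= {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this toy example, moving the cutoff from ≥1 to ≥2 raises specificity at the expense of sensitivity, which is the same pattern that makes cutoff selection difficult for the TIMI score.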

The use of an unstructured or individualized approach to ED assessment of chest pain has been shown to be associated with high resource utilization for patients with no coronary artery disease, while concurrently resulting in significant proportions of patients with ACS being missed [24]. Subsequently, emphasis has been placed on utilization of risk stratification tools. Unfortunately, studies employing risk stratification tools for chest pain in the ED setting often have two significant limitations. First, the tools employed have largely been designed for risk stratification of those with known ACS, usually patients admitted to the hospital. Utilizing these tools in the undifferentiated ED patient with chest pain may have a significant impact on both safety and efficiency. Second, many trials of chest pain risk stratification have tested the tools on ED populations where their chest pain is “thought to be of ischaemic origin.” However, as physician discrimination of ACS from other causes of chest pain has previously been shown to be poor, this may result in potentially flawed inferences [24].

The TIMI tool is widely reported as being utilized in the undifferentiated chest pain population in the ED, including as part of rapid diagnostic protocols [25]. In this study, TIMI had sensitivity and specificity for MACE comparable to those found in previously published studies [25], which again highlights that the selection of an appropriate cutoff, balancing sensitivity, specificity and clinical feasibility, is problematic.

Earlier studies identified a higher sensitivity for the Goldman tool in the ED setting than was found in this study [3, 26]. This disparity may be accounted for by differences in inclusion criteria: these studies focused on patients admitted to the hospital, included patients with ST-elevation MI and/or had much earlier follow-up for the endpoint of MACE. Additionally, both the definition of MI and the available cardiac biomarkers have changed considerably since these earlier studies.

The HFA risk stratification tool had an AUC of 0.54. Its high sensitivity for MACE (100%) came at the cost of a specificity of only 8.4%, reflecting classification of 93% of patients as being at “high risk.” This finding is concordant with a previously published study, which similarly questioned the HFA risk stratification tool’s suitability for use in this patient cohort [18]. If the HFA decision support tool [16] were applied to the patient population in this study, it would lead to 93% of ED chest pain patients being admitted to hospital wards and receiving treatments such as heparin. The weight of evidence suggests that the HFA risk stratification tool is inappropriate for use in chest pain pathways in the unselected ED chest pain population.
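The 93% figure follows directly from the reported sensitivity, specificity and event rate, as the following Python sketch shows. It uses only numbers stated in this paper (276 patients with follow-up, 39 with MACE, sensitivity 100%, specificity 8.4%); the function name `proportion_flagged` is hypothetical.

```python
def proportion_flagged(n_patients, n_events, sensitivity, specificity):
    """Share of all patients classified as test positive (here, 'high risk')."""
    n_non_events = n_patients - n_events
    true_positives = sensitivity * n_events
    false_positives = (1.0 - specificity) * n_non_events
    return (true_positives + false_positives) / n_patients


# Values reported in this study for the HFA tool.
share_high_risk = proportion_flagged(
    n_patients=276, n_events=39, sensitivity=1.00, specificity=0.084
)
print(f"{share_high_risk:.1%} of patients classified as high risk")  # 92.8%
```

With an event rate of 14.1%, almost all patients classified as high risk are false positives, which is why a specificity this low translates into near-universal admission.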

Other approaches to risk stratification of ED chest pain patients have recently been reported. The HEART score [4, 27] was developed in The Netherlands based on clinical experience and literature review rather than database methods. It has five components – history, ECG, age, risk factors and troponin level – which are each rated 0, 1 or 2 based on criteria. In a validation study, it was shown to have better discriminative performance than the GRACE and TIMI risk scores [27]. In multicentre validation, a low HEART score (≤3) had a 1.7% rate of MACE [28]. Of concern is that this score relies in part on subjective assessment of likelihood of ACS for which inter-rater data are scarce.
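The scoring arithmetic of the HEART score as described above (five components, each rated 0, 1 or 2, with a total of ≤3 regarded as low) can be sketched in Python as follows. The function names are hypothetical, and the criteria for assigning each component rating are deliberately not encoded here; they should be taken from the original HEART publications [4, 27].

```python
LOW_HEART_THRESHOLD = 3  # a total of <= 3 is described above as a low HEART score


def heart_total(history, ecg, age, risk_factors, troponin):
    """Sum the five HEART component ratings, each of which must be 0, 1 or 2."""
    components = {
        "history": history,
        "ecg": ecg,
        "age": age,
        "risk_factors": risk_factors,
        "troponin": troponin,
    }
    for name, rating in components.items():
        if rating not in (0, 1, 2):
            raise ValueError(f"{name} rating must be 0, 1 or 2, got {rating!r}")
    return sum(components.values())


def is_low_heart_score(total):
    """True when the total falls in the low range (<= 3)."""
    return total <= LOW_HEART_THRESHOLD


# Hypothetical patient with component ratings already assigned by a clinician.
total = heart_total(history=1, ecg=0, age=1, risk_factors=1, troponin=0)
print(total, is_low_heart_score(total))  # 3 True
```

The history component is the subjective element noted above, so two clinicians could assign different ratings to the same patient and arrive at different totals.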

Another approach was taken by the GRACE investigators with the development of a score aimed at predicting the absence of MACE. The GRACE freedom from events score [29] was developed in an admitted chest pain cohort with likely ACS and has undergone limited external validation on admitted chest pain cohorts [30, 31]. No validation in an ED chest pain cohort has yet been published.

A similar approach has been taken in the development of the North American Chest Pain Rule (NACPR) [32]. It aims to identify low-risk ED chest pain patients suitable for early discharge. The rule consists of the absence of five predictors—ischemic ECG changes not known to be old, history of coronary artery disease, pain typical for ACS, initial or 6-h troponin level greater than the 99th percentile and age greater than 50 years. In internal validation, it was 100% sensitive (95% confidence interval 97.2% to 100.0%) and 20.9% specific (95% confidence interval 16.9% to 24.9%) for a cardiac event within 30 days, with 11% of patients being defined as low risk [32]. Its utility has been challenged in an external validation study and comparison to the HEART score [33]. The NACPR identified 4.4% (95% CI 3-6%) for early discharge with 100% (95% CI 98-100%) sensitivity for ACS, while the HEART score identified 20% (95% CI 18-23%) for early discharge with 99% (95% CI 97-100%) sensitivity for ACS. The low proportion of patients identified as low risk in a population with MACE of 22% is a serious threat to this score’s clinical utility. That said, the approach of trying to decide who is safe for discharge rather than identification of high risk is worthy of further exploration.
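The logic of the NACPR as summarized above, where a patient is low risk only if all five predictors are absent, can be expressed as a short Python sketch. The class and field names are hypothetical, and clinical determinations such as whether pain is typical for ACS are assumed to have been made already; this is an illustration of the rule's structure, not a validated implementation.

```python
from dataclasses import dataclass


@dataclass
class ChestPainPatient:
    """Hypothetical record holding the five NACPR predictors."""
    ischemic_ecg_changes_not_known_old: bool
    history_of_coronary_artery_disease: bool
    pain_typical_for_acs: bool
    troponin_above_99th_percentile: bool  # initial or 6-h level
    age_years: int


def nacpr_low_risk(patient):
    """Return True only when none of the five predictors is present."""
    predictors = (
        patient.ischemic_ecg_changes_not_known_old,
        patient.history_of_coronary_artery_disease,
        patient.pain_typical_for_acs,
        patient.troponin_above_99th_percentile,
        patient.age_years > 50,
    )
    return not any(predictors)


younger = ChestPainPatient(False, False, False, False, age_years=42)
older = ChestPainPatient(False, False, False, False, age_years=63)
print(nacpr_low_risk(younger), nacpr_low_risk(older))  # True False
```

Because any single predictor, including age alone, excludes a patient from the low-risk group, the rule can be expected to classify only a small proportion of an ED chest pain population as suitable for early discharge.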

This study has some limitations that should be considered in interpreting the results. Bias in the verification of events is possible if the inability to contact patients in follow-up coincided with those having events. The potential for verification bias has been minimized by follow-up not only via chart review but also via admissions registry review and with the patient or proxy. While most data were collected prospectively, some were collected retrospectively and so are subject to potential data omission. Some of the data were reliant on patient self-report (e.g., of past history and risk factors). If available in the chart, this was verified; however, otherwise, no attempt was made to verify these data, reflecting the real-world ED clinical interaction. The study was conducted at a single site, and this may have impacted the external validity.

Conclusions

The TIMI risk stratification score appeared most suitable for use in an undifferentiated ED chest pain population, but selection of an appropriate cutoff is problematic. This study highlights the need for risk stratification tools validated in the ED chest pain cohort, with evaluation not only of their safety in terms of sensitivity but also of their impact on flow and efficiency, as these factors have significant implications for the safety of all ED patients.