Introduction

Acute pulmonary embolism (APE) is a hazardous condition with a high mortality rate [1, 2]. Therefore, a prompt diagnosis of APE is crucial. The current gold standard for the diagnosis of APE is computed tomography pulmonary angiography (CTPA). Additionally, an immediate risk assessment of patients with APE upon presentation is important. Various clinical, serological, and imaging factors have been found to have prognostic value in predicting outcomes for patients with APE. For instance, elevated troponin levels are an independent prognostic factor for short-term and long-term outcomes [2]. Similarly, serum lactate levels serve as a prognostic factor for short-term complications related to pulmonary embolism [3]. Patients with high lactate values have a higher mortality rate [3]. Furthermore, several clinical parameters have been identified as significant prognostic factors in APE. Survivors tend to have higher mean and systolic arterial pressures compared to non-survivors [4]. In terms of imaging, different CTPA parameters hold prognostic value for morbidity and/or mortality in patients with APE [5,6,7]. For instance, an increased ratio of right ventricle diameter to left ventricle diameter (RV/LV ratio) measured on transverse CTPA images is associated with a high risk for all-cause mortality [7]. Contrast reflux into the inferior vena cava is another strong prognostic factor for short-term mortality in patients with APE [6].

Several studies suggest that combining radiological, clinical, and serological parameters can enhance the risk assessment of APE patients [8,9,10,11,12]. The reported combined scores demonstrate high diagnostic value [8,9,10,11,12]. However, most of the proposed scores lack validation on independent cohorts, leaving the true prognostic potential of these scores uncertain.

The objective of the present study was to validate the previously proposed combined clinical-radiological scores, as reported in the literature, for predicting 7- and 30-day mortality in APE. The validation was conducted using a large cohort of patients.

Methods

This retrospective study was approved by the institutional review board (number: 58/22, Ethic Commission of the Medical Faculty, Otto-von-Guericke University Magdeburg).

Patients

For the current study, a screening of the electronic databases of the radiological department was conducted, encompassing the time period from 2015 to 2021. All patients diagnosed with acute pulmonary embolism were extracted from the database. In total, 571 patients were identified as meeting the criteria for inclusion in the study. The inclusion criteria were as follows:

  • Age ≥ 18 years.

  • Evidence of APE on CTPA.

  • Available CTPA images in the picture archiving and communication system (PACS).

  • Available clinical data (age and gender, minimal systolic and diastolic blood pressures, heart rate, episodes of syncope, and need for vasopressors).

  • Available biochemical data including pH, O2 saturation, and troponin.

Exclusion criteria were:

  • Incomplete visualization of pulmonary arteries on CT scan.

  • Chronic PE.

Clinical data were collected immediately before or after the CT scan.

Follow-up was performed using electronic hospital charts. Overall, 40 patients were excluded. The final sample comprised 531 patients with APE: 221 (41.6%) females and 310 (58.4%) males, with a mean age of 64.8 ± 15.6 years and a median age of 66 years. Baseline demographics and clinical characteristics are given in Table 1.

Table 1 Patient demographics and scores

In all cases, the diagnosis of APE was confirmed using CTPA. The CTPA scans were performed on multi-slice CT scanners, specifically Siemens Somatom Definition AS+ (Siemens Healthcare, Germany) or Canon Aquilion Prime (Canon Medical Systems, Otawara, Japan). To enhance the visibility of the blood vessels, an iodinated contrast agent was administered intravenously via a peripheral venous line at a rate of 3.0–4.0 ml/s. Automatic bolus tracking was utilized in the pulmonary trunk, with a trigger set at 100 Hounsfield units (HU) to initiate the scanning process. All patients received non-invasive medical treatment for APE.

For the purpose of the present study, the following radiological parameters were measured: the short-axis ratio of the right ventricle to the left ventricle (RV/LV), and the presence of contrast medium reflux into the inferior vena cava (IVC). The diameters of the right and left ventricles were estimated by identifying the largest points between the inner margins of the interventricular septum and the ventricle wall (Fig. 1). The reflux of contrast medium into the IVC was assessed on axial and coronal images, and it was quantified using a 4-point scale: no reflux (0 points), subcardial reflux into the IVC (1 point), intrahepatic reflux in the IVC (2 points), and subhepatic reflux in the IVC (3 points) (Fig. 2).
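As a minimal sketch of how these two measurements could be recorded for later score calculation, the 4-point reflux scale and the diameter ratio can be encoded as shown here. The names `IVCReflux` and `rv_lv_ratio` and the use of millimetres are illustrative assumptions and are not part of the study protocol; no cut-off values are implied.

```python
from enum import IntEnum


class IVCReflux(IntEnum):
    """4-point IVC reflux scale as described in the text (points = value)."""
    NONE = 0          # no reflux
    SUBCARDIAL = 1    # subcardial reflux into the IVC
    INTRAHEPATIC = 2  # intrahepatic reflux in the IVC
    SUBHEPATIC = 3    # subhepatic reflux in the IVC


def rv_lv_ratio(rv_diameter_mm: float, lv_diameter_mm: float) -> float:
    """Short-axis RV/LV ratio from the two measured diameters (hypothetical helper)."""
    if rv_diameter_mm <= 0 or lv_diameter_mm <= 0:
        raise ValueError("diameters must be positive")
    return rv_diameter_mm / lv_diameter_mm
```

Storing the reflux grade as an integer-valued enumeration keeps the point value and the anatomical label together, which may help when the grade is later fed into a combined score.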

Fig. 1

Measurement of the RV/LV diameter ratio

Fig. 2

Estimation of the reflux into the inferior vena cava

  a. No reflux into the inferior vena cava.

  b. Grade 1: Reflux into the suprahepatic inferior vena cava only.

  c. Grade 2: Reflux into the intrahepatic inferior vena cava as well as into the hepatic veins.

  d. Grade 3: Infrahepatic reflux.

All images were available in digital format and were analyzed on a dedicated PACS workstation (Infinitt PACS, Version 3.0, Infinitt Healthcare, Korea). All images were analyzed in consensus by two radiologists with 3 and 20 years of radiological experience.

The following clinical-radiological scores were calculated (Table 2): BOVA score, pulmonary embolism mortality score (PEMS), European Society of Cardiology 2019 (ESC) score, Kumamaru score, and Calgary acute pulmonary embolism (CAPE) score [8,9,10,11,12]. Additionally, the simplified pulmonary embolism severity index (sPESI) was calculated as a standard risk score [13]; it is a component of both the ESC risk stratification and PEMS.

Table 2 Analyzed combined clinical-radiological scores

Statistical analysis

Statistical analysis was performed using SPSS (version 28, IBM SPSS Statistics for Windows, Armonk, NY, USA: IBM corporation). The collected data were evaluated by means of descriptive statistics (absolute and relative frequencies for categorical variables, means and standard deviations for continuous variables).

For every calculated score, sensitivity, specificity, negative and positive predictive values, Youden index, and area under the curve values were calculated for the prognosis of 7- and 30-day mortality. For pairwise comparisons between the scores, kappa coefficients and McNemar tests were reported. The cut-offs reported in the literature were used for the calculation.
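The accuracy measures listed above follow directly from the 2×2 table of a dichotomized score against observed mortality. The following is a minimal sketch, not the SPSS procedure used in the study; the function name `diagnostic_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Accuracy measures for one dichotomized score against observed mortality.

    tp/fn: non-survivors classified as high/low risk;
    fp/tn: survivors classified as high/low risk.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
        "youden": sensitivity + specificity - 1,
    }


# Hypothetical counts for illustration only; not taken from the study data.
example = diagnostic_metrics(tp=8, fp=18, fn=2, tn=72)
```

The sketch assumes every margin of the table is non-zero; a score that classifies no patient as high risk would need separate handling.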

Results

Overall, 64 patients (12.0%) died and 467 patients (88.0%) survived within the 30-day observation time. 31 patients (5.8%) died within 7 days. The results of the analysed scores for survivors and non-survivors are shown in Tables 3 and 4. Furthermore, prognostic values of the scores to predict mortality were evaluated by using the reported cut-offs for every score. As shown, PEMS and CAPE had the highest specificity for 7-day mortality (93.4% and 85.0%, respectively). ESC 2019 showed the highest sensitivity (96.8%), followed by sPESI (90.3%). For 30-day mortality, PEMS and BOVA showed the highest specificity of 94.2% and 90.4%, respectively. The highest sensitivity was observed for ESC 2019 (95.3%), followed by sPESI (85.9%). All scores showed high negative predictive values and low positive predictive values for 7- and 30-day mortality.

The pairwise comparisons between the (dichotomized) scores showed only moderate agreement (Table 5). The only exception was the pair sPESI and ESC 2019, with a high kappa value of 0.834 (p < 0.001).
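For readers who want to reproduce this kind of pairwise comparison, the sketch below computes Cohen's kappa and a two-sided exact McNemar p-value from the 2×2 agreement table of two dichotomized scores. It is an illustration under stated assumptions: the function names are hypothetical, the exact binomial variant of the McNemar test is chosen here for simplicity, and the source does not state which variant was used.

```python
from math import comb


def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for two dichotomized scores.

    a: both high risk, b: only score A high risk,
    c: only score B high risk, d: both low risk.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value based on the discordant pairs only."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2**n
    return min(1.0, 2 * tail)
```

Kappa summarizes how often two scores agree beyond chance, whereas the McNemar test asks whether the disagreements are balanced between the two scores, so the two statistics answer different questions about the same table.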

Receiver operating characteristic curves for the analyzed scores are shown in Fig. 3 and Table 6.

Table 3 Predictive values for 7-day mortality in acute pulmonary embolism using different scores
Table 4 Predictive values for 30-day mortality in acute pulmonary embolism using different scores
Table 5 Pairwise comparison of the scores by kappa coefficient and McNemar test
Fig. 3

Receiver operating characteristic curves for the analyzed scores (30-day mortality)

Table 6 Receiver operating characteristic curves for the analyzed scores (30-day mortality)

Discussion

Risk estimation in acute pulmonary embolism (APE) is crucial due to its high mortality rate. Several scores are used to stratify risk in APE, with the simplified pulmonary embolism severity index (sPESI) being commonly employed in clinical practice.

sPESI is a clinical score that consists of six equally weighted variables, each assigned one point: age > 80 years, presence of cancer, chronic heart failure or chronic pulmonary disease, heart rate ≥ 110 beats per minute, systolic blood pressure < 100 mmHg, and arterial oxyhemoglobin saturation < 90% [13]. This score effectively distinguishes patients with a low risk of mortality (0 points) [13]. However, the literature suggests that sPESI fails to identify patients with intermediate risk, such as those with right ventricular dysfunction (RVD) and/or elevated cardiac biomarkers [10]. This suggests that the precision of sPESI can be enhanced by incorporating signs of right heart injury. In the ESC 2019 guidelines, in addition to sPESI (≥ 1 point), the need for vasopressors, signs of RVD on echocardiography or computed tomography, and cardiac troponin elevation are included [10]. ESC has been validated in numerous studies and exhibits high accuracy. For instance, it can predict adverse events in APE with a sensitivity of 96% (95% CI 79–99) and a negative predictive value of 99% (95% CI 95–99), while its specificity and positive predictive value are 57% (95% CI 53–62) and 11% (95% CI 8–17), respectively [14]. This is mirrored in the present analysis, which focused on the prediction of 7-day and 30-day mortality. ESC showed a high sensitivity (96.8% and 95.3%, respectively) and negative predictive value (99.0% and 97.1%, respectively), but low specificity and positive predictive value.
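The sPESI point count can be sketched in a few lines. This is an illustration rather than a clinical tool: the function and parameter names are hypothetical, and the heart rate criterion (≥ 110 beats per minute) is taken from the published sPESI definition [13].

```python
def spesi(age_years: float, cancer: bool, chronic_cardiopulmonary_disease: bool,
          heart_rate_bpm: float, systolic_bp_mmhg: float, sao2_percent: float) -> int:
    """Return the sPESI point count (0 to 6); each fulfilled criterion adds one point."""
    criteria = [
        age_years > 80,
        cancer,
        chronic_cardiopulmonary_disease,  # chronic heart failure or chronic pulmonary disease
        heart_rate_bpm >= 110,
        systolic_bp_mmhg < 100,
        sao2_percent < 90,
    ]
    return sum(bool(c) for c in criteria)
```

A result of 0 corresponds to the low-risk group described above, while any result of 1 or more meets the sPESI condition used in the ESC 2019 stratification.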

PEMS is another combined score that incorporates sPESI, clinical parameters, and radiological signs of RVD. This score includes sPESI (< 2 points), pH values, clinical indicators (minimal diastolic blood pressure, need for vasopressors), and radiological signs of RVD (IVC reflux grade 3). In the original study, PEMS demonstrated the following diagnostic values: sensitivity 84.9%, specificity 83.0%, positive predictive value 51.8%, and negative predictive value 96.2% [8]. The present study is the first validation of PEMS, showing that it exhibits the highest specificity (93.4% and 94.2% for 7- and 30-day mortality, respectively) in the cohort, along with a high negative predictive value (96.3% and 90.7%, respectively).

The BOVA score is a model that solely incorporates clinical and radiological signs of RVD [9]. It demonstrates a high area under the curve value for predicting in-hospital death (0.908) [15]. The BOVA score has been validated in numerous studies. According to a meta-analysis, the pooled sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio of the BOVA score for predicting short-term composite adverse outcomes are 0.25 (95% CI, 0.22–0.29), 0.93 (95% CI, 0.92–0.93), 4.05 (95% CI, 2.90–5.67), and 0.81 (95% CI, 0.74–0.88), respectively [16]. Our results align with the reported data, indicating a moderate specificity for 7-day mortality (78.4%), a good specificity for 30-day mortality (90.4%), and a high negative predictive value (96.6% and 90.4%, respectively) for the BOVA score. Its observed sensitivity and positive predictive value are low.
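The likelihood ratios quoted from the meta-analysis relate to sensitivity and specificity through two standard formulas, sketched below with the pooled point estimates as inputs. The function name is hypothetical. Pooled likelihood ratios in a meta-analysis are usually estimated separately, so the reported values need not match ratios recomputed from the pooled sensitivity and specificity.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Positive and negative likelihood ratio from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled point estimates for the BOVA score quoted in the text [16].
lr_pos, lr_neg = likelihood_ratios(0.25, 0.93)
```

With these inputs the negative likelihood ratio comes out close to the reported 0.81, which is consistent with the observation that a low-risk BOVA result only modestly lowers the probability of an adverse outcome.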

The Kumamaru score primarily relies on radiological findings and incorporates multiple features, such as pleural effusion, pericardial effusion, and enlarged lymph nodes in the thorax [11]. According to Kumamaru et al., it exhibits a high area under the curve value for predicting 30-day mortality (0.89) [11]. However, there have been no investigations validating this score to date. In our cohort, the Kumamaru score demonstrates low sensitivity, specificity, and positive predictive value for both 7- and 30-day mortality.

The CAPE score is a recently published model that incorporates clot burden, as well as radiological and clinical signs of RVD [12]. Unfortunately, the prognostic values (sensitivity, specificity) of this score regarding mortality are not yet known. Our study represents the first validation of the CAPE score, indicating low sensitivity (35.5% and 23.4%, respectively) and moderate specificity (85.0% and 84.8%, respectively) in our cohort.

To the best of our knowledge, this is the first study to compare the diagnostic performance of different clinical-radiological scores reported for predicting 7- and 30-day mortality in APE. Overall, our findings demonstrate that some of the proposed scores exhibit low sensitivity and/or specificity, while all scores display low positive predictive values. However, all scores demonstrate high negative predictive values, ranging from 95.5% to 99.0% for 7-day mortality and from 89.0% to 97.1% for 30-day mortality.
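The combination of low positive and high negative predictive values is partly a consequence of the low event rate, as the following sketch based on Bayes' theorem illustrates. The function name and the sensitivity and specificity of 90% are hypothetical; only the 7-day event rate (31 of 531 patients) is taken from the present cohort.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given event prevalence."""
    tp = sensitivity * prevalence               # expected true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)   # expected false-positive fraction
    tn = specificity * (1 - prevalence)         # expected true-negative fraction
    fn = (1 - sensitivity) * prevalence         # expected false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


prevalence_7d = 31 / 531  # 7-day mortality in the present cohort
ppv, npv = predictive_values(0.90, 0.90, prevalence_7d)
```

Under these assumptions, even a hypothetical score with 90% sensitivity and 90% specificity would reach a positive predictive value of only about 36%, while its negative predictive value would exceed 99%.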

In the absence of a standardized definition for PE-related death, we chose all-cause 7- and 30-day mortality as the primary outcome, as it is a reasonable endpoint to express the consequences of PE in patients. Jimenez et al. have shown in an analysis of the RIETE registry that 7-day all-cause mortality is close to 7-day PE-related mortality (1.8% and 1.1% for the time frame 2010–2013) [17]. The gap was wider for 30-day mortality, with 4.9% all-cause mortality and 1.8% PE-related mortality. PE-related mortality was defined as deaths confirmed by autopsy or those following severe PE. In the absence of an alternative diagnosis, death was judged as due to fatal PE. A similar gap for mortality was found in the EMPEROR study [18].

The present study has some limitations. It was a retrospective, single-center study. It is possible that patients in a critical clinical condition did not undergo CT scans, leading to selection bias. The diagnosis of APE was based on CT scans only. Patients with contraindications to radiation or contrast medium were excluded from our cohort. However, our results are based on a large cohort.

In conclusion, for the prognosis of 7- and 30-day mortality in APE, the PEMS score has the highest specificity, and ESC 2019 shows the highest sensitivity. All scores had low positive and high negative predictive values.