Introduction

Coronavirus disease 2019 (COVID-19) was declared a pandemic by the World Health Organization on March 11, 2020 [1]. This infectious disease can result in a range of clinical outcomes, from an asymptomatic or mild flu-like illness to severe pneumonia, multiorgan failure, and even death [2]. The diagnosis of COVID-19 is based on the detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by real-time reverse transcriptase–polymerase chain reaction (RT-PCR) testing, most commonly of a nasopharyngeal swab [3]. However, this method has some limitations: it is not universally available, turnaround times can be lengthy, and reported sensitivities vary widely (30–70%) [4, 5].

In the context of the COVID-19 pandemic, imaging has turned out to be a valuable complementary tool to “rule in” or “rule out” suspected COVID-19 patients, potentially accelerating diagnosis compared with RT-PCR [6]. The choice of whether to use chest radiography (CXR) or computed tomography (CT) as a first-line imaging modality for the assessment of COVID-19 depends on factors that vary considerably among scenarios (e.g., local resources, expertise) [7]. In Europe, as well as in the USA, CXR has been extensively used to triage patients with clinical suspicion of COVID-19 [8, 9]. Even though CXR is less sensitive than CT, especially in the early stage of the disease, it is widely available and relatively inexpensive, can be performed at the bedside, and allows relatively rapid cleaning and turnover between patients, thus minimizing the risk of cross-infection [10].

The spectrum of chest imaging manifestations of COVID-19 on CXR has been extensively described [11, 12]. However, while the utility of CXR in predicting clinical outcomes has been investigated in severe acute respiratory syndrome (SARS) as well as in a variety of other types of pneumonia [13, 14], very few studies have assessed the prognostic value of CXR in COVID-19 patients [15, 16]. Moreover, data about the reproducibility of CXR findings in COVID-19 are still lacking.

Therefore, our study aimed (a) to evaluate the inter-rater agreement of initial CXR findings in COVID-19 patients presenting to the emergency department (ED) during the early stage of the pandemic and (b) to determine the value of initial CXR findings combined with demographic, clinical, and laboratory data at ED presentation for predicting mortality and the need for ventilatory support in COVID-19 patients.

Materials and methods

The Institutional Review Board (Comitato Etico di Bergamo, Italy) approved this retrospective observational study and waived the requirement for written informed consent.

Study population

A total of 359 consecutive patients presenting to the EDs of two affiliated hospitals (Papa Giovanni XXIII and San Giovanni Bianco, Bergamo, Italy) between March 1 and 13, 2020, were considered eligible for inclusion. The inclusion criteria were the following: (a) initial CXR performed in the ED setting and (b) final diagnosis of COVID-19 confirmed by a positive RT-PCR test. The exclusion criteria were (a) unavailable clinical or laboratory data (n = 5) and (b) non-diagnostic CXR image quality (n = 14). Finally, a total of 340 patients were retrospectively enrolled.

Data collection

Demographic, clinical, and laboratory data were collected from patients’ medical records. The recorded data included the following: age, sex, medical comorbidities, symptoms, clinical and laboratory data within 24 h of ED presentation (including the oxygen saturation [SpO2], fraction of inspired oxygen [FiO2], arterial partial pressure of oxygen [PaO2], and PaO2/FiO2 ratio), and mode of respiratory support (oxygen mask, continuous positive airway pressure/noninvasive mechanical ventilation, invasive mechanical ventilation). For patients admitted to the intensive care unit, the highest levels of positive end-expiratory pressure, the use of extracorporeal membrane oxygenation, and prone positioning were also recorded. Patient length of stay was calculated by subtracting the date of ED presentation from the date of discharge or death. Patient survival status, as well as the date of death, was obtained from the Regional Healthcare Information System (SISS, Regione Lombardia, Italy) as of May 12, 2020.
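The two derived quantities described above (the PaO2/FiO2 ratio and the length of stay) can be illustrated with a minimal Python sketch. The function names, the assumption that PaO2 is expressed in mmHg and FiO2 as a fraction (0.21 for room air), and the example values are hypothetical and are not taken from the study data.

```python
from datetime import date


def pf_ratio(pao2_mmhg, fio2_fraction):
    """PaO2/FiO2 ratio, with FiO2 given as a fraction (assumed convention)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    return pao2_mmhg / fio2_fraction


def length_of_stay_days(ed_presentation, discharge_or_death):
    """Date of discharge or death minus date of ED presentation, in days."""
    return (discharge_or_death - ed_presentation).days


# Hypothetical patient, for illustration only.
example_ratio = pf_ratio(50.0, 0.21)
example_stay = length_of_stay_days(date(2020, 3, 5), date(2020, 3, 16))
print(round(example_ratio), example_stay)
```

With these hypothetical inputs, the ratio is about 238 and the length of stay is 11 days.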

Imaging acquisition and analysis

Images were acquired using digital radiographic systems (Definium 8000, GE Healthcare; FDR AcSelerate, Fujifilm Corporation) with tube voltages ranging from 120 to 150 kVp and by employing automatic exposure control. The imaging data included CXR images acquired in the posteroanterior and lateral (PA/LAT, n = 130) or anteroposterior (AP, n = 210) projections. The latter was performed when the patient was too unwell to tolerate standing. Only the AP and PA images were selected and retrospectively evaluated by two reviewers (G.M., a thoracic radiologist with 5 years of experience in a referral center; M.B., a fourth-year radiology resident), blinded to patient history other than COVID-19 positivity. Reviewers independently assessed the presence of lung abnormalities, including ground-glass opacities (GGOs), consolidation, and pulmonary nodules [17]. Distribution of GGOs and consolidation was classified as follows: (a) peripheral (involving mainly the peripheral one-third of the lung), central (involving mainly the central two-thirds of the lung), or neither; (b) unilateral or bilateral; (c) upper zone (above the inferior wall of the aortic arch), middle zone (between the inferior wall of the aortic arch and the right inferior pulmonary vein), lower zone (below the right inferior pulmonary vein), or no zonal predominance. The presence of pleural effusion was also assessed. The two reviewers were also asked to grade each CXR using the Brixia scoring system, an experimental 18-point severity scoring system designed for the assessment of COVID-19 pneumonia [18]. The Brixia score is obtained by dividing each lung into 3 zones (upper, middle, and lower zone, as explained above) and then scoring each zone from 0 to 3 based on the type of pulmonary infiltrates detected, as follows: 0, no lung abnormalities; 1, interstitial infiltrates; 2, interstitial and alveolar infiltrates (interstitial predominance); 3, interstitial and alveolar infiltrates (alveolar predominance).
The terms “interstitial infiltrate” and “alveolar infiltrate” used in the Brixia scoring system were reported in the current study as GGO and consolidation, respectively. CXRs showing only abnormalities other than GGOs and consolidation were scored as 0. Pure consolidation was scored as 3. The number of zones involved was also recorded. In addition, the overall extent of GGOs and consolidation was assessed by visually estimating and then averaging the percentage of involvement within each lung.
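The scoring rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the overall Brixia score is taken to be the sum of the six zone scores (which is what the 18-point range implies), and the zone names, function names, and example values are hypothetical rather than part of the study protocol.

```python
from statistics import mean
from typing import Dict

# Six zones: upper, middle, and lower zone of each lung (hypothetical labels).
ZONES = (
    "right_upper", "right_middle", "right_lower",
    "left_upper", "left_middle", "left_lower",
)


def brixia_score(zone_scores: Dict[str, int]) -> int:
    """Sum of six zone scores, each 0 to 3, giving a total from 0 to 18."""
    if set(zone_scores) != set(ZONES):
        raise ValueError("expected exactly one score for each of the six zones")
    if any(score not in (0, 1, 2, 3) for score in zone_scores.values()):
        raise ValueError("each zone score must be 0, 1, 2, or 3")
    return sum(zone_scores.values())


def zones_involved(zone_scores: Dict[str, int]) -> int:
    """Number of zones with any GGO or consolidation (score above 0)."""
    return sum(1 for score in zone_scores.values() if score > 0)


def overall_involvement(right_lung_pct: float, left_lung_pct: float) -> float:
    """Average of the visually estimated involvement of the two lungs."""
    return mean([right_lung_pct, left_lung_pct])


# Hypothetical reading, for illustration only.
reading = {
    "right_upper": 1, "right_middle": 2, "right_lower": 3,
    "left_upper": 0, "left_middle": 2, "left_lower": 3,
}
print(brixia_score(reading), zones_involved(reading), overall_involvement(60, 40))
```

For this hypothetical reading, the score is 11, five zones are involved, and the overall involvement is 50%.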

Statistical analysis

Patient data and CXR findings were reported as median and interquartile range (IQR) for continuous variables, or as numbers and frequency distribution (%) for binary or categorical variables.

CXR findings’ inter-rater agreement was assessed both in the whole group and in subgroups with AP and PA radiographs by weighted Cohen’s kappa (categorical variables), or intraclass correlation coefficient (ICC) (quantitative variables, namely the number of lung zones involved, Brixia score, and percentage of lung involvement). Moreover, the agreement in Brixia score and percentage of lung involvement was visualized by correlation and Bland-Altman plots.
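As a sketch of two of the agreement measures named above, the following Python code computes a weighted Cohen's kappa and Bland-Altman limits of agreement. Several points are assumptions: the source does not state the weighting scheme, so linear weights are used here as the default; the limits use the mean difference ± 2 standard deviations, as in the Fig. 1 caption; and all ratings shown are hypothetical. The ICC is not implemented because its specific form is not given in the source.

```python
from statistics import mean, stdev


def weighted_kappa(rater1, rater2, categories, power=1):
    """Weighted Cohen's kappa for ordered categories.

    power=1 gives linear weights (assumed here), power=2 quadratic weights.
    """
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater1)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagreement_observed = 0.0
    disagreement_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            disagreement_observed += weight * observed[i][j]
            disagreement_expected += weight * row[i] * col[j]
    return 1.0 - disagreement_observed / disagreement_expected


def bland_altman_limits(scores1, scores2):
    """Return (mean difference, lower limit, upper limit) using +/- 2 SD."""
    diffs = [a - b for a, b in zip(scores1, scores2)]
    bias = mean(diffs)
    spread = stdev(diffs)
    return bias, bias - 2 * spread, bias + 2 * spread


# Hypothetical ratings, for illustration only.
kappa = weighted_kappa([0, 1, 2, 2, 1, 0], [0, 1, 2, 1, 1, 0], categories=[0, 1, 2])
bias, lower, upper = bland_altman_limits([14, 6, 9, 11], [15, 5, 9, 10])
print(round(kappa, 2), round(bias, 2), round(lower, 2), round(upper, 2))
```

For these hypothetical ratings the linear-weighted kappa is 0.8 and the mean difference between reviewers is 0.25 points.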

Predictors of death and mode of respiratory support were identified among age, sex, comorbidities, duration of symptoms, SpO2, and PaO2/FiO2 ratio, as well as CXR findings (laterality, type of parenchymal opacity, number of lung zones involved, Brixia score, and percentage of lung involvement), by logistic and ordinal logistic regression, respectively. In all cases, univariate analyses were first performed to identify possible predictors. All variables with significant contributions at univariate analysis were included in the multivariate analysis, and the main predictors were finally identified by reducing the multivariate model using a stepwise model selection technique. The CXR findings used in the regression analyses were those of the more experienced reviewer, and only patients with no missing data were included.
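To make the link between a fitted logistic regression coefficient and a reported odds ratio explicit, the sketch below converts a coefficient and its standard error into an odds ratio with a 95% interval. It assumes a Wald-type interval (exp of the coefficient ± 1.96 standard errors); the source does not state how its intervals were obtained, and the coefficient and standard error used here are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald-type 95% interval from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error, not taken from the study.
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(beta=0.174, se=0.060)
print(round(odds_ratio, 2), round(ci_lower, 2), round(ci_upper, 2))
```

An interval that excludes 1 corresponds to a coefficient that differs significantly from 0 at the chosen level.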

Significance of the differences in demographic, clinical, and laboratory data between patients with mild (Brixia score < 8) and severe (Brixia score ≥ 8) CXR findings was assessed by two-tailed independent t test (continuous variables) or chi-squared test (binary and categorical variables). Significance of the differences in the demographic, clinical, laboratory, and radiological features between deceased and surviving patients was assessed by the Mann-Whitney test (numerical variables) or chi-squared test (binary and categorical variables).
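For a binary variable compared between two groups, the chi-squared test mentioned above reduces to a 2 × 2 table. The following sketch computes the Pearson statistic and its p value for 1 degree of freedom. It applies no continuity correction, which may differ from the settings used in the original analysis (not stated in the source), and the counts are hypothetical.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p value (1 df, no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are groups, columns are feature present/absent.
statistic, p_value = chi_squared_2x2([[30, 10], [20, 40]])
print(round(statistic, 2), p_value < 0.05)
```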

Survival curves and pertinent 95% confidence intervals were computed using the Kaplan-Meier method for the whole patient cohort as well as for patients grouped by individual grouping variables (age, sex, number of comorbidities, PaO2/FiO2 ratio, CXR findings at ED presentation). The significance of the difference between strata was assessed by the log-rank test. The distributions of age, PaO2/FiO2 ratio, Brixia score, percentage of lung involvement, and number of lung zones involved, by the most invasive respiratory support employed, were displayed as boxplots.
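The Kaplan-Meier method can be summarized as follows: at each time at which a death occurs, the running survival estimate is multiplied by the fraction of patients at risk who survive that time, and censored patients leave the risk set without changing the estimate. The sketch below implements only this point estimate (no confidence intervals or log-rank test); the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times: follow-up in days; events: 1 if death was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up of six patients, for illustration only.
curve = kaplan_meier([3, 5, 5, 8, 12, 12], [1, 1, 0, 1, 0, 0])
for day, survival in curve:
    print(day, round(survival, 3))
```

In this hypothetical example the estimate drops at days 3, 5, and 8, and the two patients censored at day 12 do not lower it further.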

In all tests, statistical significance was set at p < 0.05. All statistical analyses were performed using R software, version 3.6.3.

Results

The main demographic, clinical, and laboratory features at ED presentation of the 340 COVID-19 patients included in the study are listed in Table 1. Most patients were male (252/340, 74%). The median age was 68 years (IQR = 57–76). Arterial hypertension was the most common comorbidity (162/340, 48%), followed by cardiovascular diseases (86/340, 25%). The median number of days from symptom onset to ED presentation was 7, with the most common symptoms being fever (296/340, 87%), dyspnea (224/340, 66%), and cough (167/340, 49%). The main blood test alterations were lymphocytopenia (131/190, 69%) and increased levels of C-reactive protein (323/333, 97%), lactate dehydrogenase (278/306, 91%), and aspartate transaminase (215/331, 65%). The median PaO2/FiO2 ratio was 238 (IQR = 143–285).

Table 1 Summary of data obtained within 24 h of ED presentation in 340 patients with confirmed COVID-19

All patients underwent CXR and RT-PCR testing within the first 24 h of ED presentation. Initial RT-PCR tests were performed using nasopharyngeal swabs, according to the protocol established by the World Health Organization [3]. A total of 313 (92%) enrolled patients had a positive initial RT-PCR test result, while 27 (8%) had a negative one. Of the latter, 25 were found to have a positive result at a second (n = 20) or third (n = 5) RT-PCR test from a nasopharyngeal swab, up to a maximum of 8 days after ED presentation. In the remaining 2 cases, the diagnosis of COVID-19 was confirmed by a positive RT-PCR test from bronchoalveolar lavage fluid performed 4 and 14 days after the ED presentation, respectively. All of the 27 patients who tested negative on initial RT-PCR had CXR findings suggestive of pneumonia, while 6 patients had a CXR rated as negative by both reviewers and tested positive on initial RT-PCR. There were no statistically significant differences in CXR findings between patients whose first nasopharyngeal swab was positive and those who became positive afterward.

The inter-rater agreement of CXR findings was almost perfect for the assessment of type of parenchymal opacity (κ = 0.90; 95% CI: 0.85, 0.95), Brixia score (ICC = 0.91; 95% CI: 0.89, 0.93), and percentage of lung involvement (ICC = 0.95; 95% CI: 0.93, 0.96) [19] (Fig. 1). Notably, AP images showed an overall better inter-rater agreement than PA (Supplementary Material, Table S1). GGO admixed with consolidation was the most common finding (235/340, 69%), followed by GGO (96/340, 28%) (Fig. 2). Parenchymal opacities most frequently showed neither a peripheral nor a central distribution (219 out of 334 with parenchymal opacities, 65%) or were peripherally located (99/334, 30%). Bilateral lung involvement was found in 312 cases (93%) (Table 2). Patients with severe CXR findings more frequently suffered from dyspnea and were more likely to have laboratory abnormalities, including lower SpO2 and PaO2/FiO2 ratio values and raised inflammatory markers, liver enzymes, and creatinine levels (Supplementary Material, Table S2).

Fig. 1

Correlation and agreement between chest X-ray findings obtained by two independent reviewers in 340 patients with confirmed COVID-19. Correlation and Bland-Altman plots show the agreement in Brixia score (a, b) and percentage of lung involvement (c, d) between the reference reviewer (reviewer 1, a thoracic radiologist with 5 years of experience) and reviewer 2 (a fourth-year radiology resident). In correlation plots, the dashed line denotes the line of perfect concordance, while the solid line denotes the reduced major axis. In Bland-Altman plots, the solid line denotes mean difference, while dashed lines denote mean difference ± 2 standard deviations

Fig. 2

Chest X-ray (CXR) findings at the emergency department presentation in two patients with confirmed COVID-19 and opposite outcomes. a CXR shows bilateral, mostly peripheral, ground-glass opacities (GGOs) admixed with consolidation (consolidation-predominant) (arrowheads). Reviewer 1 assigned a Brixia score of 14 and a percentage of lung involvement of 60%. Reviewer 2 assigned a Brixia score of 15 and a percentage of lung involvement of 50%. This patient had a prolonged stay in the intensive care unit and died 11 days after presenting to the emergency department. b CXR shows bilateral GGOs, either pure (empty arrowheads) or admixed with consolidation (GGO-predominant) (solid arrowhead). Reviewer 1 assigned a Brixia score of 6 and a percentage of lung involvement of 30%. Reviewer 2 assigned a Brixia score of 5 and a percentage of lung involvement of 25%. This patient was discharged from the emergency department after a short-term observation with home care and isolation precautions and was alive at the end of the study period

Table 2 Chest X-ray analysis results obtained by two independent reviewers (reviewer 1, a thoracic radiologist with 5 years of experience; reviewer 2, a fourth-year radiology resident) in 340 patients with confirmed COVID-19

The main patient outcomes are listed in Table 3. Median observation time was 63 days (IQR = 8–67). The two most frequent respiratory supports employed were oxygen mask (144/340, 42%) and continuous positive airway pressure/noninvasive mechanical ventilation (105/340, 31%). Death occurred in 37% of cases (125/340, median age of 76 years). A total of 58 patients (17%, median age of 60 years) were admitted to the ICU, of whom 22 died (38%, median age of 66 years).

Table 3 Outcomes of 340 patients with confirmed COVID-19. Follow-up information is reported as of May 12, 2020

Deceased patients were significantly older and had a higher number of comorbidities, significantly lower SpO2 and PaO2/FiO2 ratio values, and more severe CXR findings at ED admission than patients who survived (p < 0.001 in all cases; Table 4). Survival curves differed significantly between age classes, PaO2/FiO2 ratio values, and several CXR findings (Brixia score, number of lung zones involved, and percentage of lung involvement) (p < 0.001 in all cases) (Fig. 3).

Table 4 Demographic, clinical, chest X-ray, and laboratory data of 340 patients with confirmed COVID-19 at ED presentation divided into groups based on their clinical outcome: deceased or survived

Fig. 3

Survival curves for 340 patients with confirmed COVID-19, grouped by demographic variables (a: age, b: sex), PaO2/FiO2 ratio (c), and chest X-ray findings at presentation to the emergency department (d: Brixia score, e: number of lung zones involved, f: percentage of lung involvement), over a median observation time of 63 days. Shaded areas denote 95% confidence intervals, while p denotes the significance of the difference between strata at log-rank test. PaO2/FiO2 ratio, ratio of partial pressure of oxygen to fraction of inspired oxygen

In the regression analysis, the Brixia score (OR: 1.19; 95% CI: 1.06, 1.34; p = 0.003), age (OR: 1.16; 95% CI: 1.11, 1.22; p < 0.001), PaO2/FiO2 ratio (OR: 0.99; 95% CI: 0.98, 1.00; p = 0.002), and cardiovascular diseases (OR: 3.21; 95% CI: 1.28, 8.39; p = 0.014) significantly predicted death. Percentage of lung involvement (OR: 1.02; 95% CI: 1.01, 1.03; p = 0.001), SpO2 (OR: 0.96; 95% CI: 0.92, 0.99; p = 0.008), PaO2/FiO2 ratio (OR: 0.99; 95% CI: 0.99, 1.00; p < 0.001), and rheumatic pathologies (OR: 3.22; 95% CI: 1.05, 9.89; p = 0.041) predicted the need for ventilatory support (Table 5). The distribution of age, PaO2/FiO2 ratio, Brixia score, number of lung zones involved, and percentage of lung involvement by respiratory support employed is shown in Fig. 4.
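Odds ratios for continuous predictors are easier to interpret when scaled to a clinically meaningful difference. The sketch below assumes that each reported OR refers to a one-unit increase (one Brixia point, one year of age) and that the effect is log-linear, as is standard in logistic regression but not stated explicitly in the source. Because the reported ORs are rounded, the scaled values are approximate.

```python
def scaled_odds_ratio(or_per_unit, units):
    """Odds ratio for a difference of `units`, assuming a log-linear effect."""
    return or_per_unit ** units


# Reported per-unit odds ratios for death (rounded values from the text).
brixia_5_points = scaled_odds_ratio(1.19, 5)
age_10_years = scaled_odds_ratio(1.16, 10)
print(round(brixia_5_points, 2), round(age_10_years, 2))
```

Under these assumptions, a 5-point higher Brixia score corresponds to roughly 2.4 times the odds of death, and a 10-year age difference to roughly 4.4 times the odds, with the other predictors in the model held constant.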

Table 5 Demographic, clinical, laboratory, and chest X-ray data that demonstrated predictive value for death and the most invasive respiratory support employed (none, oxygen mask, continuous positive airway pressure/noninvasive ventilation, or invasive ventilation) in 340 patients with confirmed COVID-19

Fig. 4

Distribution of age (a), PaO2/FiO2 ratio (b), and CXR findings (c: Brixia score, d: percentage of lung involvement, e: number of lung zones involved) at presentation to the emergency department by the most invasive respiratory support employed in 340 patients with confirmed COVID-19. CXR, chest X-ray; ED, emergency department; PaO2/FiO2 ratio, ratio of partial pressure of oxygen to fraction of inspired oxygen; OM, oxygen mask; CPAP/NIV, continuous positive airway pressure/noninvasive mechanical ventilation; IV, invasive mechanical ventilation

Discussion

The ongoing COVID-19 pandemic has highlighted the need for prompt diagnostic and prognostic strategies to optimize patient management, especially when the availability of critical care resources is limited or overwhelmed. In the present study, the vast majority of COVID-19 patients (334/340, 98%) had signs of pneumonia on CXR, including all of those who tested negative at initial RT-PCR. Presence, distribution, and type of parenchymal opacity (i.e., GGO, consolidation, or both), as well as the Brixia score and the percentage of lung involvement, were consistently assessed by two independent reviewers with different levels of expertise. We found that a higher Brixia score, increasing age, underlying cardiovascular diseases, and lower PaO2/FiO2 ratio values were significant predictors of death. The main predictors of the need for ventilatory support were found to be a higher percentage of lung involvement on CXR, the presence of rheumatic pathologies, and lower SpO2 and PaO2/FiO2 ratio values.

A few studies have examined the value of CXR to predict COVID-19 outcomes. In early May 2020, Borghesi et al introduced the Brixia score, an experimental CXR scoring system for quantifying lung abnormalities in COVID-19 pneumonia [18]. High Brixia score values have been found to predict in-hospital mortality for COVID-19 [15]. Also, Toussie et al found that a lung zone severity score on the initial CXR was associated with the need for intubation in COVID-19 patients aged 21–50 years [16]. To our knowledge, no studies have so far investigated the value of the initial CXR to predict mortality in COVID-19 patients. In the present study, we demonstrated for the first time that the Brixia score on the initial CXR is predictive of fatal outcome (based on in-hospital and out-of-hospital deaths) in COVID-19 patients. Moreover, in keeping with Toussie et al [16], we found the percentage of lung involvement to be a predictor of the need for ventilatory support. Although the Brixia score and the percentage of lung involvement were significant predictors of both mortality and ventilatory support in the univariate analysis, only one of them remained significant in each outcome’s multivariate model because the two scores provide partially overlapping information. The number of lung zones involved and the type of parenchymal opacities also differed significantly between survivors and deceased patients, the latter more frequently presenting with a greater degree of lung involvement and consolidation. Notably, the overall inter-rater agreement was better for AP images, where the higher disease severity may have led to more obvious CXR findings.

Our study population was mainly composed of patients presenting in a relatively advanced stage of the disease, with a median number of days from symptom onset to ED presentation of 7. The proportion of normal CXRs (6/340, 2%) was, therefore, considerably lower than that reported in previous studies where patients presented earlier in the course of their disease [11, 16, 20]. In accordance with reports showing the highest radiological severity of the disease approximately 6–11 days after the onset of symptoms [21, 22], we found CXR signs of advanced pneumonia in a high proportion of patients: GGO admixed with consolidation (235/340, 69%), bilateral parenchymal opacities (312/334, 93%), and a median percentage of lung involvement of 55%. Pleural effusion was recorded in a higher percentage of patients (53/340, 16%) than that previously reported [11].

Unlike previous findings [23], we did not find any significant differences between patients with a positive initial RT-PCR result and those who became positive afterward, both presenting with a high prevalence of GGO admixed with consolidation (68% and 81%, respectively). Moreover, all of the 27 patients who tested negative on initial RT-PCR had CXR findings suggestive of pneumonia, thus underlining the potential of CXR as a valuable complementary diagnostic tool in the first-line work-up of suspected COVID-19 patients.

In accordance with Du et al [24], our findings confirm that older age and cardiovascular diseases predict fatal outcome in COVID-19 patients. As expected, a greater number of comorbidities, hypertension, and diabetes were also found to be associated with death. However, in accordance with previous findings [25], these parameters did not remain significant predictors of mortality in multivariate analysis. In line with a recently published larger series, neither smoking nor obesity (defined as BMI ≥ 30) was found to be associated with death [26].

PaO2/FiO2 ratio was found to be a significant predictor of death and the need for ventilatory support, while SpO2 was a significant predictor of the need for ventilatory support only. PaO2/FiO2 ratio, as a surrogate of hypoxia, was previously found to be a predictor of death in other types of pneumonia [27] and to be significantly lower in patients who died of COVID-19 compared with those who survived [28]. In our cohort, most patients presented with mild respiratory failure (median PaO2/FiO2 ratio = 238), and PaO2/FiO2 ratio at ED presentation was significantly lower in deceased patients than in survivors (179 vs. 262, p < 0.001). Our results provide further evidence in support of the PaO2/FiO2 ratio as a critical parameter to assess disease severity in patients with severe respiratory symptoms due to COVID-19.

The “protective” role of increasing age against the need for ventilatory support found in the present study can be safely considered artifactual. This result was most likely influenced by the extraordinary distribution of limited healthcare resources, which were preferentially allocated to patients with a higher chance of therapeutic success and a longer life expectancy.

The present study has some limitations. First, in such an emergency, the completeness of the data recorded was less than optimal. Moreover, our cohort attended the ED several days after symptom onset and in a relatively advanced disease stage, thus making the generalizability of our results uncertain. Also, the imbalance between clinical needs and availability of intensive care resources has plausibly led to discrepancies between disease severity or clinical outcomes and the mode of respiratory support employed. Lastly, given the very low number of CT scans performed in our institution in this emergency situation and considering that most of them were not acquired at ED presentation but rather later in the disease course, a comparison between CT and CXR findings was not feasible.

In conclusion, CXR is a reproducible tool for assessing COVID-19. Along with patient history, SpO2, and PaO2/FiO2 ratio values, CXR at ED presentation may help to identify patients at risk for death and ventilatory support, thus helping to optimize clinical management in high-prevalence settings of the disease.