Introduction

The clinical spectrum of SARS-CoV-2 infection ranges from asymptomatic infection to a disease (COVID-19) that can lead to potentially fatal respiratory failure [1, 2].

The outbreak of SARS-CoV-2 has rapidly spread worldwide since the end of 2019, with Italy being one of the first and most affected countries facing the epidemic outside of China [3, 4].

The diagnosis of COVID-19 currently relies on reverse transcription–polymerase chain reaction (RT-PCR) conducted on oropharyngeal and/or nasopharyngeal swabs. However, while false positives are conceivably rare, false negatives can occur, even in patients with pneumonia, who may have negative nasal/oropharyngeal samples but positive lower airway samples. The true clinical sensitivity of RT-PCR is thus unknown [5, 6]. Moreover, the huge demand for tests is compromising their availability in some areas [5] and creating frequent delays in diagnosis confirmation, with consequences on timely treatment, isolation, and contact tracing.

Computed tomography (CT) often shows typical findings in COVID-19 pneumonia, especially bilateral patchy ground-glass opacities and consolidation, with a predominantly peripheral distribution. Crazy-paving pattern, peripheral vessel enlargement, and findings of organizing pneumonia such as reverse halo sign have also been described [7,8,9,10,11]. However, CT may show no abnormalities, especially early after symptom onset [12], and CT findings are not specific, significantly overlapping with other infections [13].

The role of CT in the COVID-19 epidemic is debated. Its use as a screening tool has been proposed [14, 15] but is highly discouraged by major radiology societies in Western countries [10, 16]. European and American societies underline the need for RT-PCR for diagnostic confirmation [10, 16], even though they suggest repetition of RT-PCR in cases of suggestive CT findings in symptomatic patients [10]. Finally, in patients with respiratory symptoms such as dyspnea and desaturation, CT may help stratify disease severity and patient prognosis [10]. The Fleischner Society has recently released a multinational consensus statement suggesting adapting the use of chest imaging to different clinical scenarios according to the severity of clinical features, pre-test probability of COVID-19, and resource constraints [17].

The aim of this study was to assess the sensitivity and specificity of CT vs RT-PCR for the diagnosis of COVID-19 pneumonia in a retrospective Italian cohort of symptomatic patients presenting to the emergency room (ER) for suspected COVID-19 during the outbreak peak. We also compared relevant blood tests in classes of patients with different combinations of CT and RT-PCR results.

Methods

Setting

In the Reggio Emilia province (Northern Italy, 532,000 inhabitants, six hospitals), the first case of SARS-CoV-2 infection was diagnosed on February 27, 2020. Up to March 24, there were 1200 RT-PCR-confirmed cases and the epidemic was still spreading. The study was approved as a retrospective analysis by the Area Vasta Emilia Nord Ethics Committee on 04/07/2020 (protocol number 2020/0045199). Patients’ written consent to publish their images was obtained, and patients’ informed consent to participate in the study was obtained whenever possible, given the retrospective nature of the study.

Study design

This was a cross-sectional study assessing sensitivity and specificity of CT for COVID-19 pneumonia at two different thresholds of suspicion, using RT-PCR as the reference standard.

Study population

All consecutive patients who presented to the Reggio Emilia province ERs between March 13 and March 23 for suspected COVID-19 and underwent both CT and RT-PCR were eligible. Subjects with a time gap between CT and RT-PCR > 3 days were excluded.

During the COVID-19 outbreak, the diagnostic protocol for these patients included nasopharyngeal and oropharyngeal swabs for RT-PCR, blood tests, chest X-rays, and CT scan in cases of suggestive X-ray findings or negative X-rays but highly suggestive clinical features. A structured CT report was introduced on March 13.

Reference standard

Two issues hamper the measurement of CT accuracy for COVID-19. First, the clinical sensitivity of RT-PCR, our reference standard, although not yet quantified, is not 100% [5]. Second, the target condition of our index test, i.e., CT, is viral pneumonia, while the target condition of RT-PCR is SARS-CoV-2 infection.

While the second issue cannot be easily solved, to overcome the first problem, we used three definitions of the reference standard: (1) the first RT-PCR within 3 days after CT; (2) the first RT-PCR and, if negative, repeated RT-PCR tests in the following 15 days; if the test was not repeated, the patient was considered non-COVID-19; (3) as in definition 2, but RT-PCR-negative patients who were not retested were classified as COVID-19 or non-COVID-19 in the same proportion as patients who were actually retested in that group of CT results.

A commercial one-step RT-PCR kit (GeneFinder™ COVID-19 PLUS RealAmp Kit) was used, and the assay was performed on an Applied Biosystems 7500 Sequence Detection System.

We also report an estimate of RT-PCR sensitivity using retesting as the reference standard, assuming that those retested were a random sample of the RT-PCR-negative patients.

CT acquisition technique

CT scans were performed using one of three scanners (128-slice Somatom Definition Edge, Siemens Healthineers; 64-slice Ingenuity, Philips Healthcare; 16-slice GE Brightspeed, GE Healthcare) without contrast media injection, with the patient in supine position, during end-inspiration. Scanning parameters were as follows: tube voltage 120 kV, automatic tube current modulation, collimation width 0.625 or 1.25 mm, acquisition slice thickness 2.5 mm, and interval 1.25 mm. Images were reconstructed with a high-resolution algorithm at slice thickness 1.0/1.25 mm. Patients wore face masks, and thorough decontamination of the room was performed after each patient.

CT analysis and structured reporting

During routine reporting, each radiologist completed both the usual radiology report as well as a structured report about the probability of COVID-19 pneumonia based on CT findings (highly suggestive, suggestive, non-suggestive) (Fig. 1), the presence/absence of ground-glass opacities and consolidations, and the extension of pulmonary lesions using a visual scoring system (< 20%, 20–40%, 40–60%, and > 60% of parenchymal involvement) (Fig. 2). Swab results were unknown when reporting, so radiologists were blinded to RT-PCR. However, they were frequently informed of blood test results and of patients’ clinical features. An example of the format used for the structured reporting is provided in Supplementary Material.

Fig. 1

Exemplification of classification of CT findings: a CT findings highly suggestive of COVID-19 pneumonia, with bilateral interstitial involvement, patchy ground-glass opacities (arrow), and peripheral consolidations (*), confirmed by positive RT-PCR; b CT findings suggestive of COVID-19 pneumonia, with unilateral peripheral consolidation and subtle ground-glass opacities, confirmed by positive RT-PCR; c CT findings non-suggestive of COVID-19 pneumonia, with mostly unilateral bronchial wall thickening, endobronchial secretions, tree-in-bud nodules, and consolidation, confirmed by negative RT-PCR

Fig. 2

Visual scoring system used to classify the extension of parenchymal involvement: < 20% (a), 20–40% (b), 40–60% (c), and > 60% (d)

Blood tests

When available, C-reactive protein (CRP) level, LDH, total leukocyte, lymphocyte, neutrophil, and platelet counts measured on ER admission were collected. The tests were carried out in the Hospital Clinical Laboratories with routine automated methods. Oxygen saturation level (SpO2) was also collected for patients who had it measured before being provided with oxygen support.

These tests were included because they have previously been associated with COVID-19 diagnosis, severity, and prognosis: increased CRP reflects host inflammatory response, along with increased total leukocyte and neutrophil counts; elevated LDH concentrations may be a sign of end-organ damage; decreased platelet count may be associated with an underlying coagulopathy; lymphopenia may represent a concomitant immune dysfunction and has been associated with increased disease severity and worse prognosis; SpO2 provides information on lung damage and functionality [18, 19].

Statistical analyses

CT scan sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for COVID-19 pneumonia were computed for two different thresholds: highly suggestive only and highly suggestive plus suggestive findings.

Accuracy measures according to the three reference standard definitions reported above were calculated with relative 95% confidence intervals (CI) computed on the exact binomial distribution.
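
As an illustration of the exact binomial (Clopper-Pearson) interval referred to above, the following Python sketch finds the interval limits by bisection on the binomial distribution, using only the standard library. It is not the code used in the study. The function names are hypothetical, and the example proportion (423 of 551) is the number of patients with highly suggestive CT among those RT-PCR-positive at the first swab, as given in the Results.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def _root_of_decreasing(f, lo: float = 0.0, hi: float = 1.0, iterations: int = 100) -> float:
    """Bisection for a function that is positive near lo and negative near hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided Clopper-Pearson interval for x successes out of n trials."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_of_decreasing(
        lambda p: alpha / 2 - (1.0 - binom_cdf(x - 1, n, p))
    )
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_of_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2
    )
    return lower, upper


lower, upper = exact_binomial_ci(423, 551)
print(f"{423 / 551:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Bisection is used here only to keep the sketch free of dependencies; statistical packages usually obtain the same limits from quantiles of the beta distribution.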

Distribution of clinical characteristics across groups of CT and RT-PCR results is reported. Associations between clinical characteristics and CT and RT-PCR classes were assessed with Pearson’s chi-squared test and Fisher’s exact test. For blood tests, we report mean (± SD) and median (IQR) stratified by groups of CT and RT-PCR results. Comparisons between groups were conducted with one-way ANOVA and linear regression models adjusted for sex and age.

P values are reported as continuous measures, and no predefined significance threshold was used.

Analyses were performed with the Stata 13.0 SE software package (Stata Corporation).

Results

Population

After excluding 6 patients with a > 3-day interval between CT and RT-PCR, we included 696 patients who underwent CT and RT-PCR in the Reggio Emilia provincial ERs between March 13 and 23 for suspected COVID-19 (Fig. 3). Of these 696 patients, 288 (41.4%) were women. Overall mean age was 59 years (SD 15.8), 58.3 years (SD 16.5) for women, and 59.5 years (SD 15.2) for men.

Fig. 3

Flowchart representing the study population and the subgroups with different combinations of CT and RT-PCR results

Overall, 454 (65%) had CT findings which were judged by the reporting radiologist as highly suggestive of COVID-19, 127 (18%) as suggestive, and 115 (17%) as non-suggestive.

Among patients with highly suggestive CT findings, 423 (93%) had positive and 31 (7%) negative RT-PCR at the first swab performed on ER admission. Of the 127 patients with suggestive CT findings, 97 (76%) had positive and 30 (24%) negative RT-PCR at the first swab, while 31/115 (27%) patients with non-suggestive CT findings had positive RT-PCR at the first swab.

The proportion of women decreased with the increase in CT suspicion, while age was slightly higher in the group with suggestive CT. The proportion of hospitalizations and deaths significantly increased with the increase in CT suspicion (Table 1).

Table 1 Age, sex, outcomes, and blood tests in CT classes

Overall, 38 patients with negative RT-PCR were retested within 15 days, 12 of whom (31.6%) tested positive: 5 out of the 11 with CT findings highly suggestive of COVID-19 pneumonia, 1 out of the 8 with suggestive CT findings, and 6 out of the 19 with non-suggestive CT (Fig. 4). Assuming this rate of false negatives, the sensitivity of RT-PCR using repeated test as the reference standard was 92.3% (95% CI 86.9–96.0).

Fig. 4

Examples of discordant cases between CT and RT-PCR. a Focal polygonal consolidation without ground-glass opacity, considered as non-suggestive at CT scan but resulting in positive RT-PCR test. b Bilateral (mostly right) patchy ground-glass opacities, with small areas of consolidation, which was classified as highly suggestive, resulting in a first negative RT-PCR, followed by a positive RT-PCR performed 7 days later

The prevalence of RT-PCR positives was 79.2%, 80.9%, and 85.6% according to reference standard definitions 1, 2, and 3, respectively.
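
The three prevalence figures and the RT-PCR sensitivity estimate can be approximately reproduced from the counts given in the text. The Python sketch below uses the first-swab results by CT class and the per-class retest counts quoted above (11, 8, and 19 retested; 5, 1, and 6 positive on retest). Two points are assumptions made for this illustration rather than the documented procedure: imputed cases under definition 3 are rounded to whole patients, and the pooled retest positivity is applied to all first-swab negatives when estimating RT-PCR sensitivity. Variable names are hypothetical.

```python
# First-swab results by CT class: (RT-PCR positive, RT-PCR negative).
# The 84 negatives with non-suggestive CT are derived as 115 - 31.
first_swab = {
    "highly suggestive": (423, 31),
    "suggestive": (97, 30),
    "non-suggestive": (31, 84),
}
# Retests among first-swab negatives: (retested, positive on retest).
retests = {
    "highly suggestive": (11, 5),
    "suggestive": (8, 1),
    "non-suggestive": (19, 6),
}

total = sum(pos + neg for pos, neg in first_swab.values())
positives_first = sum(pos for pos, _ in first_swab.values())
negatives_first = total - positives_first

# Definition 1: first swab only.
prev1 = positives_first / total

# Definition 2: add observed retest positives; unretested negatives stay negative.
retest_positives = sum(pos for _, pos in retests.values())
prev2 = (positives_first + retest_positives) / total

# Definition 3: apply the class-specific retest positivity to every first-swab
# negative in that CT class (rounded to whole patients, an assumption).
imputed = sum(
    round(first_swab[c][1] * retests[c][1] / retests[c][0]) for c in first_swab
)
prev3 = (positives_first + imputed) / total

# RT-PCR sensitivity: assume the pooled retest positivity holds for all
# first-swab negatives, then compare detected cases with detected plus missed.
n_retested = sum(n for n, _ in retests.values())
estimated_missed = negatives_first * retest_positives / n_retested
rtpcr_sensitivity = positives_first / (positives_first + estimated_missed)

for label, value in [("definition 1", prev1), ("definition 2", prev2),
                     ("definition 3", prev3), ("RT-PCR sensitivity", rtpcr_sensitivity)]:
    print(f"{label}: {value:.1%}")
```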

CT findings

Ground-glass opacities were present in almost all patients with highly suggestive (453/454) and suggestive (123/127) CT findings, and in 38% (44/115) of patients with non-suggestive findings. When considering subgroups of RT-PCR results, in all three CT classes, ground-glass opacities were more frequent in patients with positive RT-PCR.

Consolidation was present in 279/454 (61.5%) patients with highly suggestive, 76/127 (59.8%) patients with suggestive, and 41/115 (35.7%) patients with non-suggestive CT findings. The distribution of consolidation among subgroups of positive and negative RT-PCR patients in the three CT classes varied, being more frequent in negative RT-PCR patients in the intermediate CT class, i.e., suggestive.

A limited (< 20%) extension of pulmonary lesions was more frequently present in patients with suggestive and non-suggestive CT findings. Parenchymal extension was not estimated in 55/115 patients with non-suggestive findings; 41/55 had a completely normal CT scan, while 14/55 presented pleural effusion without parenchymal involvement. Interestingly, among the 31 patients with non-suggestive CT and positive RT-PCR, 13 patients had no CT abnormalities and 1 had only pleural effusion (Table 2).

Table 2 CT findings in CT classes

CT diagnostic accuracy

The sensitivity, specificity, PPV, and NPV of CT at two different thresholds (highly suggestive findings only and highly suggestive plus suggestive findings) with respect to the three different reference standards described in the “Methods” section are reported in Table 3. CT sensitivity ranged from 73 to 77% and from 90 to 94% for high and low positivity threshold, respectively. Specificity ranged from 79 to 84% for high positivity threshold and was about 58% for low positivity threshold. PPV remained ≥ 90% in all cases, and NPV ranged from 50 to 73% and from 35 to 47% for low and high positivity threshold, respectively.
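
To make the two positivity thresholds concrete, the following Python sketch builds the 2 × 2 table for each threshold from the first-swab counts given in the text (reference standard definition 1 only) and derives the four accuracy measures. The class and function names are hypothetical, and the values are recomputed from the counts in the text, so they may differ from Table 3 by rounding.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def accuracy(tp: int, fp: int, fn: int, tn: int) -> Accuracy:
    """Standard accuracy measures from the cells of a 2 x 2 table."""
    return Accuracy(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


# (CT class, RT-PCR positive, RT-PCR negative) at the first swab,
# ordered from most to least suspicious. 84 is derived as 115 - 31.
CT_CLASSES = [
    ("highly suggestive", 423, 31),
    ("suggestive", 97, 30),
    ("non-suggestive", 31, 84),
]


def accuracy_at_threshold(n_positive_classes: int) -> Accuracy:
    """Count the first n_positive_classes CT classes as a positive CT."""
    called_positive = CT_CLASSES[:n_positive_classes]
    called_negative = CT_CLASSES[n_positive_classes:]
    tp = sum(pos for _, pos, _ in called_positive)
    fp = sum(neg for _, _, neg in called_positive)
    fn = sum(pos for _, pos, _ in called_negative)
    tn = sum(neg for _, _, neg in called_negative)
    return accuracy(tp, fp, fn, tn)


high = accuracy_at_threshold(1)  # highly suggestive only
low = accuracy_at_threshold(2)   # highly suggestive plus suggestive

for name, result in [("high threshold", high), ("low threshold", low)]:
    print(f"{name}: sensitivity {result.sensitivity:.1%}, "
          f"specificity {result.specificity:.1%}, "
          f"PPV {result.ppv:.1%}, NPV {result.npv:.1%}")
```

Lowering the threshold moves the 127 patients with suggestive CT from the CT-negative to the CT-positive row, which raises sensitivity and NPV at the cost of specificity.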

Table 3 CT diagnostic accuracy

Blood tests

Means (SD) and medians (IQR) across CT classes are reported in Table 1.

Among RT-PCR-negative patients, the values of total leukocyte, lymphocyte, neutrophil, and platelet counts, and CRP level were higher than among RT-PCR-positive patients. Total leukocyte, lymphocyte, and platelet counts decreased, whereas CRP and LDH increased from non-suggestive to suggestive and/or highly suggestive CT. SpO2 values were similar across CT and RT-PCR groups when the CT was suggestive or highly suggestive, while in patients with non-suggestive CT, SpO2 was higher, particularly for those who tested positive at RT-PCR.

Discussion

In a large sample of consecutive patients presenting to the ER for suspected pneumonia during the peak of the SARS-CoV-2 outbreak in Italy, we estimated CT sensitivity for COVID-19 pneumonia to be between 73 and 77% when adopting a high positivity threshold, which corresponded to a specificity of between 79 and 84%. When adopting a lower positivity threshold, CT sensitivity was between 90 and 94%, but specificity decreased to 58%. Nevertheless, given the very high prevalence of COVID-19 during the epidemic peak, the proportion of patients with SARS-CoV-2 infection among the CT-positive patients (PPV) was always equal to or higher than 90%, whatever the positivity threshold adopted.
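
The dependence of PPV on prevalence follows directly from Bayes' theorem. The short Python sketch below holds sensitivity and specificity fixed at values within the ranges reported for the high positivity threshold (77% and 79%) and varies only the prevalence. The 50% and 10% prevalences are hypothetical scenarios chosen for illustration, not settings examined in this study, and the function name is an assumption.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


for prevalence in (0.80, 0.50, 0.10):
    ppv, npv = predictive_values(0.77, 0.79, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With unchanged sensitivity and specificity, PPV is above 90% at 80% prevalence but falls below 30% at 10% prevalence, which is why PPV estimates from the epidemic peak should not be carried over to low-prevalence settings.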

Leukocyte, lymphocyte, neutrophil, and platelet counts decreased, and LDH increased both in RT-PCR-positive patients and in highly suggestive and/or suggestive CT scans. CRP decreased in RT-PCR-positive compared with RT-PCR-negative patients and increased in highly suggestive and suggestive CT scans. SpO2 was slightly lower in patients with suggestive and highly suggestive CT results. Nevertheless, as there is a large overlap of the distribution of values in RT-PCR-negative and positive patients, this study was not able to identify any blood test that could increase pre-test probability of COVID-19 pneumonia. It should be borne in mind that the population included in this study had 80% or more prevalence of RT-PCR-confirmed disease. Hence, it is very difficult to find subpopulations with substantially higher prevalence [20]. This further suggests that the data observed in high-prevalence populations should not be used to forecast PPV in low-probability populations. In fact, it is plausible that in low-prevalence settings, some of these blood tests could be used to select patients to be referred to CT, increasing the prevalence by a factor of two or three, an increase that cannot be observed in a high-prevalence population.

A recent study has assessed CT diagnostic performance in Italy by using only one threshold (positive versus negative CT scan) and two RT-PCR within 24 h as the reference standard in 158 patients. Results were similar to those obtained in our study with lower positivity threshold: sensitivity and specificity of 97% (95% CI, 88–99%) and 56% (95% CI, 45–66%), respectively [21].

A meta-analysis estimated a 94% sensitivity and a 37% specificity of CT scan for RT-PCR-confirmed COVID-19 [20]. In the largest study conducted in China on over 1000 patients, CT scan presented similar sensitivity (97%) but lower specificity and PPV (25% and 65%, respectively) [22]. Given the lower RT-PCR positive rate (59.2%) and high rate of RT-PCR-negative CT-positive patients who were eventually classified as probable COVID-19 according to their global clinical course (81%), it is plausible to think that the RT-PCR false negative rate was higher in the study by Ai et al, leading to underestimating CT specificity and PPV. Accordingly, in our study, when applying different reference standards so as to reduce RT-PCR false negative rates, specificity rose to 84% and PPV to 96% at the higher CT suspicion level.

Furthermore, in China, CT scan has been proposed and is used as a screening tool in asymptomatic or mildly symptomatic patients [14, 23], who are more likely to have low viral loads and false negative RT-PCR results [24]. Thus, the combination of an increase in clinical sensitivity of the reference standard, the application of CT only to symptomatic cases presenting in ER, and a greater knowledge of COVID-19 CT findings in the radiology community thanks to the pioneering studies from China may explain the higher CT specificity estimate in our study.

It is not surprising that in the outbreak peak phase, with very high disease incidence, the proportion of RT-PCR-positive patients among those with non-suggestive CT was quite high, as infected patients may not yet have developed pneumonia or may never develop it. In fact, almost one half of these patients in our study had no CT abnormalities.

Limitations and strengths

The main limitation of this study, in common with all the others with a similar aim, is that CT and RT-PCR target two different conditions, i.e., COVID-19 pneumonia and SARS-CoV-2 infection; using one as the reference standard of the other thus introduces methodological challenges. Any RT-PCR false negative will result in an overestimation of CT false positives. Nevertheless, our estimate of RT-PCR sensitivity in this population is about 92.3%, consistent with the 89% estimate of a recent systematic review [20]. Several biases can affect this estimate because retesting of RT-PCR-negative CT-positive cases was not systematic in our study.
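
The direction of this bias can be illustrated with a small calculation. The Python sketch below computes the specificity that CT would appear to have when the reference standard misses a fraction of infected patients, so that some CT-positive infected patients are counted as CT false positives. All input values are hypothetical, and the sketch assumes a reference standard with perfect specificity and CT and RT-PCR errors that are independent given true infection status. Neither assumption is verified in this study.

```python
def apparent_specificity(prevalence: float, ct_sensitivity: float,
                         ct_specificity: float, ref_sensitivity: float) -> float:
    """CT specificity as measured against an imperfect reference standard.

    Assumes the reference has no false positives and that CT and reference
    errors are independent given true infection status.
    """
    # Truly uninfected patients with a negative CT.
    uninfected_ct_negative = (1 - prevalence) * ct_specificity
    # Infected patients missed by the reference who also have a negative CT.
    missed_ct_negative = prevalence * (1 - ref_sensitivity) * (1 - ct_sensitivity)
    # Everyone the reference labels as negative.
    reference_negative = (1 - prevalence) + prevalence * (1 - ref_sensitivity)
    return (uninfected_ct_negative + missed_ct_negative) / reference_negative


# Hypothetical scenario: 85% true prevalence, CT sensitivity 75%, CT specificity 85%.
for ref_sensitivity in (1.00, 0.90, 0.80):
    value = apparent_specificity(0.85, 0.75, 0.85, ref_sensitivity)
    print(f"reference sensitivity {ref_sensitivity:.0%}: apparent CT specificity {value:.1%}")
```

Under these assumptions, a reference standard with 90% sensitivity lowers the apparent CT specificity from 85% to about 63% at this prevalence, so even a modest false negative rate of RT-PCR can distort specificity estimates substantially when most patients are infected.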

Radiologists were not blinded to clinical presentation and blood tests, which is common in real practice, but this should be considered if the results are applied to settings where this information is not yet available. Also, as a consequence of how the reporting was structured and of the need to be rapid and concise during the epidemic phase of the disease, we collected only some of the CT findings of COVID-19. This approach is surely less accurate than the retrospective review performed by experienced thoracic radiologists adopted by most studies, but it is more representative of the real-life diagnostic process.

Implications for practice

The results of this study are not intended to produce evidence to be generalized to all clinical scenarios. Nevertheless, we show high sensitivity and high positive predictive value of CT for COVID-19 pneumonia in the epidemic setting.

The specific phase during which the study was conducted reflects one of the scenarios proposed by the Fleischner Society [17], characterized by a high number of symptomatic patients presenting to the ERs, high pre-test probability, and unavailability of rapid virological testing.

Further studies are needed to assess specificity and PPV in lower prevalence settings. Also, a structured report for suspected COVID-19 patients may help in monitoring the proportion of positive results and the PPV of different positivity thresholds in different phases of the epidemic. Moreover, including CT from the pre-epidemic period [13] may help in assessing specificity to be projected in periods when COVID-19 could cyclically shift from endemic to epidemic phases.

Conclusions

In a high-prevalence setting (during the outbreak peak), CT presented a high PPV and may thus be considered a good reference to help clinicians to recognize and triage COVID-19 patients while waiting for RT-PCR diagnostic confirmation. Our results also confirm that in case of negative RT-PCR and highly suggestive CT findings, RT-PCR should be repeated; the patient should remain isolated, given the high probability of RT-PCR false negatives in this group.