Diagnostic performance of CT lung severity score and quantitative chest CT for stratification of COVID-19 patients

Purpose Lung severity score (LSS) and quantitative chest CT (QCCT) analysis could help stratify patients affected by COVID-19 pneumonia at hospital admission. This study aims to assess the performance of LSS and QCCT in the severity stratification of COVID-19 patients.
Materials and methods From April 19, 2020, until May 3, 2020, patients with chest CT suggestive of interstitial pneumonia who tested positive for COVID-19 were retrospectively enrolled and stratified at hospital admission into Group 1, 2 and 3 (home isolation, low intensive care and intensive care, respectively). For LSS, the lungs were divided into 20 regions and visually assessed by two radiologists, who scored each region as 0 for no lung involvement, 1 for < 50% involvement and 2 for > 50% involvement. QCCT was performed with dedicated software that extracts pulmonary involvement expressed in liters and percentage. LSS and QCCT were evaluated with ROC curve analysis to assess the performance of both methods. P values < 0.05 were considered statistically significant.
Results The final population included 136 patients (87 males, mean age 66 ± 16 years), with 19 patients in Group 1, 86 in Group 2 and 31 in Group 3. Significant differences for LSS were observed in almost all comparisons, especially in Group 1 vs 3 (AUC 0.850, P < 0.0001) and Group 1 + 2 vs 3 (AUC 0.783, P < 0.0001). QCCT showed significant results in almost all comparisons, especially for Group 1 vs 3 (AUC 0.869, P < 0.0001). Neither LSS nor QCCT showed significant differences between Group 1 and Group 2.
Conclusions LSS and QCCT could represent promising tools to stratify COVID-19 patient severity at admission.


Introduction
Since severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was reported as the cause of a new viral pneumonia, coronavirus disease 2019 (COVID-19), in Wuhan, Hubei, China, in December 2019, its spread over the following weeks has led to a pandemic, with over 58 million people infected worldwide, and the number is still increasing [1].
While awaiting specific treatments and the effects of vaccines, the best approach consists of early diagnosis, correct severity stratification and supportive therapy to allow a better prognosis in critical patients [2]. Some clinical and laboratory biomarkers are emerging as predictive tools to help physicians correctly stratify patients [3][4][5]. Among possible severity biomarkers, chest computed tomography (CT) represents a valid noninvasive option [5][6][7][8]. Chest CT has shown high sensitivity (97%) despite low specificity (25-56%) for the diagnosis of COVID-19 [9,10]. In addition, chest CT allows the assessment of COVID-19 lung impairment, mostly represented by multiple and peripheral ground-glass opacities (GGO) and possible associated consolidations [10,11]. These alterations correlate with lung function in patients affected by acute respiratory distress syndrome (ARDS) [12], and COVID-19 can worsen to severe lung injury with ARDS and the need for intubation. Timely recognition of lung impairment is therefore essential to adapt supportive therapy and achieve a better outcome [13].
Some authors have proposed different lung severity scores [13][14][15][16] based on a visual assessment of pulmonary impairment. Lung severity scores could help stratify patients and guide clinical management at admission and during follow-up [16,17], despite the intrinsic limitations of visual assessment, such as poor reproducibility in the clinical context and the lack of standardization among the proposed methods.
To overcome the limitations of visual assessment, new imaging tools are emerging; examples include the deep-learning pulmonary quantification proposed by Huang et al. [18] and CT-aided quantification software [19]. To date, a few studies have correlated quantitative lung impairment with clinical assessment and follow-up, but more studies are needed to confirm the reliability of lung quantification in clinical settings.
Thus, the aim of our study is to assess the performance of chest CT in the severity stratification of COVID-19 patients at hospital admission and to compare the diagnostic performance of lung severity score (LSS) and quantitative chest CT (QCCT).

Patient population and admission groups
This study was approved by our local institutional review board (IRB) and conducted in accordance with the Declaration of Helsinki. Informed consent was obtained from all patients; when patients were unable to provide it, their relatives or the admitting physicians did so. Four hundred and seven consecutive patients admitted to the Emergency Department of BLINDED with interstitial pneumonia from April 19, 2020, until May 3, 2020, were retrospectively included in the study.
Inclusion criteria were: (1) patients admitted to the emergency department with suspicion of COVID-19, (2) patients with chest CT highly suspicious for interstitial pneumonia, according to the main lung features linked to typical COVID-19 pneumonia [10,11]. Exclusion criteria were: (1) patients who tested negative on swabs for SARS-CoV-2 detection, (2) patients who underwent chest CT with contrast medium injection, (3) patients who refused chest CT, (4) patients with a history of lung malignancy that required pulmonary resection and (5) chest CT with images degraded by motion artifacts.
According to the hospital's internal protocol, suspected COVID-19 patients who were admitted with moderate-to-severe clinical features and a high pretest probability of disease (fever defined as > 37.5 °C and respiratory symptoms or direct contact with a confirmed COVID-19 patient) underwent nasopharyngeal and oropharyngeal swabs for SARS-CoV-2 and chest CT to assess lung impairment. Chest CT was performed at admission to obtain a real-time evaluation of the lung parenchyma at baseline.
Every patient was tested with two nasopharyngeal and oropharyngeal swabs, the first at admission and the second after 24 h. Positivity for SARS-CoV-2 was determined with reverse transcriptase-polymerase chain reaction (RT-PCR) (Charité, Berlin, Germany) [20], while patients were considered SARS-CoV-2 negative after two consecutive negative RT-PCR results. For all included patients, demographic data and laboratory results were collected (Table 1).
In addition, at hospital admission, patients were clinically stratified by severity of symptoms and care needs into home isolation or hospitalization (low intensive or intensive care) according to the guidelines of our hospital [21]. The population was thus divided into three groups: Group 1, home isolation; Group 2, low intensive care; and Group 3, intensive care.

CT acquisition technique
All suspected COVID-19 patients underwent chest CT to evaluate the presence of interstitial pneumonia. Chest CT was acquired without contrast medium, in the supine position and during end-inspiration. Each patient was studied using a COVID-19 dedicated 128-slice CT (GE Revolution EVO 64 Slice CT Scanner, GE Medical Systems, Milwaukee, WI, USA). CT scan technical parameters were as follows: tube voltage: 100 kV; tube current modulation: 100-250 mAs; ASiR-V: 50%; spiral pitch factor: 0.98; collimation width: 0.625. Images were reconstructed with the BONEPLUS convolution kernel at a slice thickness of 1.25 mm.

Lung severity score
A visual assessment of COVID-19 lung impairment was performed by two radiologists in consensus (GG and DC, with 6 and 8 years of experience). According to a lung severity score (LSS) already proposed in the literature [9], the lungs were divided into 20 regions. This number was reached starting from the anatomical division into 18 lung segments, two of which were each further divided into two regions: the anterior medial basal segment of the inferior left lobe was split into anterior basal and medial basal regions, while the posterior apical segment of the superior left lobe was divided into apical and posterior regions.
Readers assigned each region a score based on the visually estimated percentage of parenchymal involvement (including GGO, consolidation and pleural effusion): 0 for no involvement, 1 for less than 50% involvement and 2 for more than 50% involvement. The maximum reachable score was therefore 40. Readers visually assessed all selected patients on the reconstructed images, with the possibility of multi-planar reconstruction and a fixed lung window (window width/level: 1600/−600 HU).
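The scoring rule described above can be illustrated with a minimal Python sketch. It is not the readers' actual workflow, which was visual; the function names are hypothetical, and because the text does not state how a region with exactly 50% involvement is scored, assigning it a score of 1 is an assumption.

```python
from typing import Sequence

N_REGIONS = 20  # 18 anatomical segments, two of which are split in two


def region_score(involvement_pct: float) -> int:
    """Map the estimated involvement (%) of one region to a score of 0, 1 or 2."""
    if not 0 <= involvement_pct <= 100:
        raise ValueError("involvement must be a percentage between 0 and 100")
    if involvement_pct == 0:
        return 0
    # Assumption: exactly 50% is scored as 1; the text only gives < 50% and > 50%.
    return 1 if involvement_pct <= 50 else 2


def lung_severity_score(involvements: Sequence[float]) -> int:
    """Sum the per-region scores; the result ranges from 0 to 40."""
    if len(involvements) != N_REGIONS:
        raise ValueError(f"expected {N_REGIONS} regions, got {len(involvements)}")
    return sum(region_score(p) for p in involvements)


# Hypothetical patient: 10 spared regions, 6 mildly and 4 heavily involved.
example_lss = lung_severity_score([0] * 10 + [30] * 6 + [80] * 4)
```

With these illustrative values the six mildly involved regions contribute 6 points and the four heavily involved regions contribute 8, for a total of 14 out of 40.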

Quantitative chest CT
Two radiologists in consensus (FP and MP, with 5 and 4 years of experience), blinded to the clinical stratification of patients, performed QCCT analysis using dedicated software (Thoracic VCAR v13.1, GE). Before segmentation, an attenuation value < −1000 HU was used to exclude trachea air from the analysis. Quantitative analysis was performed on the native acquisition using a lung window with a width of 1500 HU and a level of −600 HU; in particular, well-aerated lung was selected using a density range between −950 and −700 HU [22][23][24]. The software automatically calculated the following features, expressed in percentages, using an adaptive mean based on gray scale: GGO, consolidation, fibrotic-like alterations (including fibrotic-like streaks and subpleural lines), total lung impairment and healthy lung. Vessels were automatically selected and deleted. In case of inadequate automatic segmentation, readers were free to adjust the area of lung impairment segmented by the software [25].
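As a rough illustration of threshold-based quantification, the sketch below computes the percentage of well-aerated lung from a list of voxel attenuation values, using only the two thresholds quoted above. It does not reproduce the vendor software, whose adaptive algorithm is not described in the text; the function name, the inclusive bounds and the voxel values are assumptions made for illustration.

```python
from typing import Iterable

AIR_CUTOFF_HU = -1000.0             # values below this are excluded as trachea air
WELL_AERATED_HU = (-950.0, -700.0)  # well-aerated range quoted in the text


def well_aerated_percentage(lung_hu: Iterable[float]) -> float:
    """Percentage of retained lung voxels whose HU lies in the well-aerated range."""
    retained = [hu for hu in lung_hu if hu >= AIR_CUTOFF_HU]
    if not retained:
        raise ValueError("no lung voxels left after air exclusion")
    low, high = WELL_AERATED_HU
    # Assumption: bounds are inclusive; the text only states the range.
    aerated = sum(1 for hu in retained if low <= hu <= high)
    return 100.0 * aerated / len(retained)


# Hypothetical voxel values in HU, for illustration only.
example_voxels = [-1010.0, -900.0, -800.0, -650.0, -300.0]
healthy_pct = well_aerated_percentage(example_voxels)
```

In this toy example one voxel is discarded as air and two of the remaining four fall in the well-aerated range, so the healthy fraction is 50%.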

Statistical analysis
Statistical analysis was performed using MedCalc Statistical Software version 17.9.7 (MedCalc Software bvba, Ostend, Belgium), and P values < 0.05 were considered statistically significant. All data are expressed as mean ± standard deviation (SD). The Kolmogorov-Smirnov test was used to assess data distribution. Normally distributed data were tested with Student's t test, while the Wilcoxon test was applied to non-normally distributed data. LSS and QCCT were analyzed with receiver operating characteristic (ROC) curves, and the area under the curve (AUC) was calculated to assess the performance of both methods in distinguishing the clinically stratified admission groups (Group 1, 2 and 3). Further sub-analyses compared the performance of LSS and QCCT for Group 1 versus Group 2 + 3 combined and for Group 1 + 2 combined versus Group 3. For LSS, the interobserver agreement was also evaluated.
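For readers who wish to reproduce an AUC without dedicated software, the following sketch uses the standard identity between the empirical AUC and the Mann-Whitney statistic: the AUC is the probability that a randomly chosen patient from the more severe group has a higher score than one from the less severe group, with ties counted as one half. This is a generic illustration rather than the MedCalc procedure, and the scores shown are hypothetical.

```python
from typing import Sequence


def roc_auc(negatives: Sequence[float], positives: Sequence[float]) -> float:
    """Empirical AUC: P(positive score > negative score) + 0.5 * P(tie)."""
    if not negatives or not positives:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical LSS values for a less severe and a more severe group.
less_severe = [10, 12, 16]
more_severe = [14, 28, 30]
example_auc = roc_auc(less_severe, more_severe)
```

Here 8 of the 9 possible pairs are ranked correctly, which gives an AUC of about 0.89.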

Patient population
According to the exclusion criteria, the following patients were excluded from the initial population of 407: 142 patients due to chest CT negative for interstitial pneumonia, 103 patients with negative SARS-CoV-2 swabs, 12 patients who underwent chest CT with contrast medium injection and 14 patients with motion artifacts on chest CT (Fig. 1). The final population included 136 patients, 87 male and 49 female (mean age 66 ± 16 years, range 28-97); in particular, 19 patients belonged to admission Group 1, 86 were in Group 2 and 31 were admitted in Group 3. Full clinical data divided by admission group are displayed in Table 1.
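The patient flow reported above can be checked with simple arithmetic. The short sketch below uses only the counts given in the text; the dictionary keys are descriptive labels chosen here for readability.

```python
initial_population = 407

# Exclusions as reported in the text.
exclusions = {
    "chest CT negative for interstitial pneumonia": 142,
    "negative SARS-CoV-2 swabs": 103,
    "chest CT with contrast medium": 12,
    "motion artifacts": 14,
}

final_population = initial_population - sum(exclusions.values())

# Admission groups and sex distribution as reported in the text.
groups = {"Group 1": 19, "Group 2": 86, "Group 3": 31}
sexes = {"male": 87, "female": 49}

# Both breakdowns should add up to the final population.
groups_consistent = sum(groups.values()) == final_population
sexes_consistent = sum(sexes.values()) == final_population
```

The 271 exclusions leave 136 patients, which matches both the sum of the three admission groups and the sum of male and female patients.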
Significant differences were observed in C-reactive protein and D-dimer between Group 1 and 3, with P = 0.0058 and P = 0.0005, respectively; lactic acid dehydrogenase and D-dimer differed significantly between Group 2 and Group 3, with P = 0.0006 and P = 0.0007, respectively. Group 1 + 2 compared to Group 3 showed significant differences for C-reactive protein, lactic acid dehydrogenase and D-dimer, with P = 0.0006, P = 0.0005 and P = 0.0001, respectively. No significant differences were observed among the other clinical parameters.

Discussion
Our study tested the performance of two methods in differentiating COVID-19 patients at admission into three groups: home isolation (Group 1), low intensive care (Group 2) and intensive care (Group 3). The first was the chest CT lung severity score, based on the percentage of lung parenchyma involved without a specific analysis of the type of alteration (i.e., consolidation, pleural effusion and GGO); the second was quantitative chest CT, performed with dedicated software that semiautomatically quantified each parenchymal change. Both LSS and QCCT showed significant and good performance in stratifying the severity of COVID-19 patients at admission to the Emergency Department; in particular, they helped identify Group 3, the intensive care patients. The best diagnostic performance, in terms of AUC, for discriminating COVID-19 patients in home isolation (Group 1) from those in intensive care (Group 3) was reached by the lung severity score with an AUC of 0.850 (P < 0.0001) and by quantitative chest CT for consolidations and fibrotic-like alterations with AUCs of 0.869 and 0.842 (all P < 0.0001), respectively. LSS results were also supported by the evaluation of interobserver agreement, which was excellent. We did not assign a qualitative score to each type of alteration, in order to reduce the bias usually associated with visual assessment. Furthermore, the analysis of clinical laboratory data showed consistent differences in several comparisons, with higher values of C-reactive protein, D-dimer and lactic acid dehydrogenase in patients with high-risk disease who needed low intensive or intensive care.
Similar results were obtained for discriminating COVID-19 patients in low intensive care (Group 2) from those in intensive care (Group 3) for both LSS (AUC 0.768, P < 0.0001) and QCCT in terms of consolidation (AUC 0.794, P < 0.0001) and total lung impairment (AUC 0.790, P < 0.0001). When COVID-19 patients in home isolation and low intensive care were grouped together (Group 1 + 2) and compared with Group 3, LSS yielded an AUC of 0.783 (P < 0.0001), while QCCT for consolidation and total lung impairment showed AUCs of 0.808 and 0.798, respectively (all P < 0.0001). Interestingly, no differences were found between Group 1 and 2 for either LSS or QCCT.
The first general consideration concerns the similar performance of both methods in distinguishing COVID-19 patients in intensive care from Group 1, Group 2 and Group 1 + 2, even though a semi-automated quantification would theoretically seem more accurate than a visual one; similar results were obtained by Cong S. and colleagues [19] in terms of correlation between the lesion percentage scored by radiologists and by the computer software.
Despite differences among the wide variety of lung severity scores [14,16,19,26], our lung severity score results are in line with others in the literature, showing a similar trend of greater lung impairment with worsening COVID-19 clinical conditions. The most comparable results are those of Yang et al. [16], owing to the similarity of the score adopted: their score achieved an AUC of 0.892 with a sensitivity and a specificity of 83% and 94%, respectively, and a cutoff > 19.5 to differentiate mild cases from severe cases. Our sub-analysis of Group 1 compared with Group 2 + 3 showed a cutoff > 10 for the LSS with an AUC of 0.668, a sensitivity of 80.34% and a specificity of 52.63%; conversely, a similar cutoff was observed for Group 2 vs Group 3 and for the sub-analysis of Group 1 + 2 compared with Group 3. Some discrepancies in cutoff values can be explained by different clinical stratification guidelines at the Emergency Department. However, the intrinsic limitations of visual assessment and the consequent difficulty of comparing lung severity scores cannot be disregarded.
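To make the cutoff-based figures above concrete, the sketch below computes sensitivity and specificity for a rule that calls a patient positive when the score is strictly above a cutoff. The example scores are hypothetical. The final lines show that the reported 80.34% and 52.63% are consistent with 94 of 117 patients in Group 2 + 3 and 10 of 19 patients in Group 1; these counts are back-calculated here from the reported percentages and group sizes and are not stated in the text.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    negatives: Sequence[float], positives: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for the rule 'score > cutoff is positive'."""
    if not negatives or not positives:
        raise ValueError("both groups must contain at least one score")
    true_pos = sum(1 for s in positives if s > cutoff)
    true_neg = sum(1 for s in negatives if s <= cutoff)
    return true_pos / len(positives), true_neg / len(negatives)


# Hypothetical LSS values, for illustration only.
sens, spec = sensitivity_specificity(
    negatives=[4, 8, 10, 14], positives=[9, 12, 18, 25, 30], cutoff=10
)

# Back-calculation (an inference, not reported counts).
n_group1 = 19
n_group23 = 86 + 31  # Group 2 + Group 3 = 117
reported_sens = round(100 * 94 / n_group23, 2)
reported_spec = round(100 * 10 / n_group1, 2)
```

With the illustrative scores, four of five positives exceed the cutoff (sensitivity 0.80) and three of four negatives do not (specificity 0.75).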
On the other hand, QCCT, with percentage quantification extracted through the semi-automated method, allows some interesting considerations. Consolidations, fibrotic-like alterations and total lung impairment were the most significant parameters in terms of ROC curve performance across the different group comparisons.
Our results showed a higher prevalence of consolidations in more severe patients; this can be explained by the progression of lung injury due to increasing infiltration of both the pulmonary parenchyma and the interstitial spaces, caused by alveolar inflammatory exudation, diffuse alveolar damage and necrotizing bronchitis due to viral invasion and the inflammatory reaction [27,28]. Likewise, the higher percentage of total lung impairment can be explained by the coexistence in severe patients of different lung injuries, such as GGO, crazy paving, consolidations and fibrotic-like alterations, which together increase the total amount of lung injury [19,27,29,30]. Conversely, data regarding fibrotic-like alterations are less in accordance with other studies [26,27]; in fact, Ding and colleagues [26] observed in a cohort of 112 patients that linear opacities were more frequent in stages 4 and 5, corresponding to 15-28 days after the onset of symptoms. Lyu et al. also describe pulmonary fibrosis as an uncommon finding on baseline chest CT in all of the disease severity groups analyzed [27]. A possible explanation is that patients had pneumonia symptoms for some days before they arrived at the Emergency Department, so that when they underwent chest CT some of the alterations, such as secondary organizing pneumonia, might be compatible with a medium or late stage of pneumonia [31]. Interestingly, GGO percentage did not perform well in stratifying patients, in accordance with the findings of Lyu P. and colleagues [27]. Finally, the cutoff for the percentage of healthy lung parenchyma between Group 1 and Group 3 (cutoff < 72%, AUC 0.83) is in accordance with the cutoff reported by Colombi D. and colleagues in 236 patients (cutoff < 71%, AUC 0.86) [23].
Moreover, the consistent differences in laboratory data between Group 1 and Groups 2-3 are in agreement with the previous study of Watanabe et al. [5], which demonstrated higher values of inflammation markers in patients who needed intensive care. Inflammation markers could therefore be associated with consistent parenchymal impairment, in the setting of a cytokine storm.
Fig. 3 a Chest CT scan of a 69-year-old male patient followed in home isolation (Group 1) and b corresponding quantitative chest CT, segmented in red, reporting the percentage of ground-glass opacities (9%), consolidations (1.19%), fibrotic-like alterations (0.85%), total lung impairment (11.06%) and healthy parenchyma (88.11%), while vessels in yellow were excluded from the semiautomatic analysis; the patient's lung severity score was 16. c Chest CT scan of an 83-year-old male patient in intensive care (Group 3) with the corresponding quantitative chest CT shown in d: ground-glass opacities (30.42%), consolidations (2.07%), fibrotic-like alterations (3.19%), total lung impairment (35.69%) and healthy parenchyma (62.18%); the patient's lung severity score was 28
Limitations of our study include its retrospective nature; the lack of clinical and radiological follow-up; the absence of a combined clinical and radiological model for patient stratification; the choice to use a CT-aided program for lung quantification instead of a deep-learning model; the lack of comparison between visual and quantitative scores; the enrollment of patients in an early pandemic scenario in which severity was extraordinarily high, which could represent a patient selection bias; and the lack of LSS interobserver agreement. In the future, we aim to overcome these drawbacks by performing an analysis on a more heterogeneous population, using a more consistent quantification software and including follow-up data.
In conclusion, our study demonstrates the feasibility of both chest CT lung severity score and quantitative chest CT as tools to stratify COVID-19 patient severity at Emergency Department admission; quantitative chest CT might be integrated with clinical parameters to support accurate triage of COVID-19 patients.

Authors' contributions
DC contributed to supervision, conceptualization, formal analysis and writing-review and editing. MZ contributed to data curation, writing-original draft and methodology. MP contributed to formal analysis and writing. FP contributed to writing-review and editing and formal analysis. GG and GT contributed to software and formal analysis. TP contributed to software and validation. CR contributed to formal analysis, data curation and writing. BB contributed to data curation, software and writing. AL contributed to project administration, writing-review and editing, supervision and validation.

Funding
The authors did not receive support from any organization for the submitted work.

Data and material availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code availability
Texture analysis was performed by using a dedicated software (TexRAD, Feedback Medical Ltd., Cambridge, UK). Statistical analysis was performed using MedCalc Statistical Software version 17.9.7 (MedCalc Software bvba, Ostend, Belgium).

Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.

Consent to participate
Informed consent was obtained from all individual participants included in the study.

Consent for publication
The authors affirm that human research participants provided informed consent for publication of the images in Fig. 3.

Ethics approval
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by our local institutional review board (IRB).

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.