Patient population and groups
Our institutional review board approved this retrospective study, and the requirement for informed consent was waived.
We searched the electronic databases of two institutions ((I) Renmin Hospital of Wuhan University, (II) Central Hospital of Wuhan optical valley), retrospectively reviewed the records of patients seen between January 30, 2020, and April 30, 2020, and identified 449 patients infected with COVID-19. All COVID-19 infections were confirmed by real-time reverse transcriptase–polymerase chain reaction. The overall pipeline of our study is shown in Fig. 1.
According to the “Diagnosis and Treatment Program of Pneumonia of New Coronavirus Infection (Trial 7th Edition)” recommended by China’s National Health Commission, all patients were classified as having the minimal, ordinary, severe, or critical type [9]. In our study, ordinary cases were included in the non-severe disease (NSD) group, while severe and critical cases were merged into the severe disease (SD) group. All patients in the SD group met at least one of the following criteria: (1) respiratory rate ≥ 30 breaths per minute; (2) finger oxygen saturation ≤ 93% in a resting state; (3) arterial oxygen tension (PaO2)/inspiratory oxygen fraction (FiO2) ≤ 300 mmHg (1 mmHg = 0.133 kPa); (4) respiratory failure requiring mechanical ventilation; (5) shock; (6) other organ failure requiring intensive care unit (ICU) monitoring and treatment.
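The SD criteria listed above amount to a logical OR over six conditions. The following is a minimal sketch of that rule, not the classification procedure used in the study; the `Vitals` record and its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    """Hypothetical per-patient record; field names are illustrative."""
    respiratory_rate: float        # breaths per minute
    spo2_rest: float               # finger oxygen saturation at rest, in percent
    pao2_fio2: float               # PaO2/FiO2 ratio, in mmHg
    mechanical_ventilation: bool   # respiratory failure requiring ventilation
    shock: bool
    organ_failure_icu: bool        # other organ failure requiring ICU care


def meets_sd_criteria(v: Vitals) -> bool:
    """Return True if at least one of the six listed criteria is met."""
    return (
        v.respiratory_rate >= 30
        or v.spo2_rest <= 93
        or v.pao2_fio2 <= 300
        or v.mechanical_ventilation
        or v.shock
        or v.organ_failure_icu
    )


def mmhg_to_kpa(mmhg: float) -> float:
    """Convert mmHg to kPa using the factor quoted in the text."""
    return mmhg * 0.133
```

For example, a patient with a respiratory rate of 32 breaths per minute and otherwise normal values would satisfy criterion (1) and therefore the rule as a whole.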
Image acquisition and lesion segmentation
Non-enhanced chest CT scans of 316 patients were acquired from the lung apex to the lung base at the end of inspiration using multi-detector CT (MDCT) scanners (Brightspeed CT or Optima 680 CT, GE Healthcare). Breath-hold training was carried out before each examination. The chest CT scanning parameters were as follows: field of view (FOV), 36 cm; tube voltage, 120 kV; tube current, adjusted automatically; noise index, 13; section thickness, 5 mm; slice interval, 5 mm; pitch, 1.375; collimation, 64 × 0.625 mm; gantry rotation speed, 0.7 s; matrix, 512 × 512; mediastinal window, width of 200 HU with a level of 35 HU; and lung window, width of 1500 HU with a level of −700 HU.
All images were segmented using commercial segmentation software (Lung Intelligence Kit 2.1, LK 2.1, GE Healthcare) [10]. First, pre-processing was performed in two steps: the images were resampled to an isotropic spatial resolution of 1 mm × 1 mm × 1 mm (x-spacing, y-spacing, and z-spacing), and a Gaussian filter with a standard deviation of 0.5 was applied for signal smoothing. Then, the lung was automatically segmented into five lobes, and the volumes of interest (VOIs) were automatically contoured for each lobe. The segmentation results were manually corrected by a radiologist (Z.K., 2 years of experience in radiology) and then confirmed by another radiologist (F.Z., 5 years of experience in radiology). CTLP was defined as the volume of the lesions (including ground-glass opacity (GGO), consolidation, and reticulation) divided by the volume of the entire lung. CTSS was used to estimate the pulmonary involvement of all abnormalities on the basis of the area involved [11]. Each of the five lung lobes was visually scored from 0 to 5 as follows: 0, no involvement; 1, less than 5% involvement; 2, 5–25% involvement; 3, 26–49% involvement; 4, 50–75% involvement; or 5, more than 75% involvement [12].
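The two burden measures defined above can be illustrated with a short sketch. It rests on several assumptions that the text does not state: score 2 is read as 5% to 25% involvement, values falling between the listed bands are assigned to the higher band, and the per-patient CTSS is taken as the sum of the five lobe scores. In the study the lobe scores were assigned visually, so the percentage input here is purely illustrative, and all function names are hypothetical.

```python
def ctlp(lesion_volume: float, lung_volume: float) -> float:
    """Lesion volume divided by whole-lung volume (same units for both)."""
    if lung_volume <= 0:
        raise ValueError("lung_volume must be positive")
    return lesion_volume / lung_volume


def lobe_score(involvement_pct: float) -> int:
    """Map an estimated lobe involvement (percent) to a 0-5 score."""
    if involvement_pct <= 0:
        return 0
    if involvement_pct < 5:
        return 1
    if involvement_pct <= 25:   # assumption: score 2 covers 5% to 25%
        return 2
    if involvement_pct < 50:
        return 3
    if involvement_pct <= 75:
        return 4
    return 5


def ctss(lobe_involvements_pct: list[float]) -> int:
    """Assumed total score: sum of the five lobe scores (range 0 to 25)."""
    if len(lobe_involvements_pct) != 5:
        raise ValueError("expected one value per lung lobe (five in total)")
    return sum(lobe_score(p) for p in lobe_involvements_pct)
```

Under these assumptions, lobes with 0%, 3%, 25%, 30%, and 90% involvement would score 0, 1, 2, 3, and 5, for a total of 11.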
Collection of clinical data and evaluation of CT radiological features
Clinical data were recorded, including the following 10 characteristics: age, sex, duration of onset, comorbidity, clinical type, treatments, respiratory support strategies (RSS), ICU admission, length of ICU stay, and length of hospital stay. We also recorded other clinical parameters, such as oxygen saturation and temperature, but these were not used in the clinical model, because the former is one of the known diagnostic criteria for severe COVID-19 and the latter was an unstable parameter owing to uncertain medical history. Comorbidities included hypertension, diabetes, chronic obstructive pulmonary disease (COPD), chronic kidney disease (CKD), malignant tumors, and surgery history (on any part of the body in the last 10 years). The number of comorbidities per patient ranged from 0 to 4, where 0 indicates no comorbidity and 4 indicates four coexisting conditions. Treatments included antiviral therapy, antibiotic therapy, glucocorticoid therapy, immunoglobulin therapy, and Chinese medicine therapy. RSS included nasal catheter, high-flow nasal cannula oxygen therapy, non-invasive mechanical ventilation, invasive mechanical ventilation, and extracorporeal membrane oxygenation (ECMO).
All CT images were evaluated by two radiologists (Z.K. and L.L.) who were blinded to each subject’s clinical data. In cases of disagreement between the two primary readers, a third thoracic radiologist with 25 years of experience (Y.Z.) made the final decision. Ten CT radiological features were assessed, namely GGO, consolidation, reticular pattern, interlobular septal thickening, air bronchogram sign, lesion location, distribution, involved lobe, thickening of pleura, and pleural effusion.
Radiomics feature extraction
A total of 851 radiomics features were extracted from the VOIs segmented with the LK software, including first-order statistics parameters (n = 18), morphological parameters (n = 14), gray-level co-occurrence matrix (GLCM) parameters (n = 24), gray-level run length matrix (GLRLM) parameters (n = 16), gray-level size zone matrix (GLSZM) parameters (n = 16), gray-level dependence matrix (GLDM) parameters (n = 14), neighboring gray tone difference matrix (NGTDM) parameters (n = 5), and wavelet parameters (n = 744). All features complied with the feature definitions described by the Imaging Biomarker Standardization Initiative (IBSI) [13]. The detailed workflow of the radiomics analysis is shown in Fig. 2. Intraclass correlation coefficients (ICCs) were used to assess the intra- and inter-observer reproducibility of radiomics feature extraction.
Radiomics features selection and radiomics signature construction
Values lying outside the range defined by the mean and standard deviation of a feature were treated as outliers and replaced by the median of that feature vector. Standardization was then performed to scale the data to a common interval. Spearman correlation, a generalized linear model (GLM), and the least absolute shrinkage and selection operator (LASSO) were used to reduce feature redundancy and selection bias, thereby removing highly correlated features. A radiomics score (Rad-score) was calculated for each patient as a linear combination of the selected features weighted by their respective coefficients.
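The outlier handling and Rad-score steps can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the outlier range is taken as the mean plus or minus `n_sd` standard deviations, with `n_sd` left to the caller because the text does not give it; standardization is shown as z-scoring, which is only one possible reading; and the optional `intercept` term is hypothetical. The Spearman, GLM, and LASSO selection steps are not reproduced here.

```python
from statistics import mean, median, pstdev


def replace_outliers(values: list[float], n_sd: float) -> list[float]:
    """Replace values outside mean +/- n_sd * SD with the feature median."""
    m, s, med = mean(values), pstdev(values), median(values)
    low, high = m - n_sd * s, m + n_sd * s
    return [v if low <= v <= high else med for v in values]


def zscore(values: list[float]) -> list[float]:
    """Assumed standardization: zero mean and unit standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def rad_score(features: dict[str, float],
              coefficients: dict[str, float],
              intercept: float = 0.0) -> float:
    """Linear combination of the selected features and their coefficients."""
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
```

With two selected features valued 2.0 and -1.0, coefficients 0.5 and 2.0, and an intercept of 0.1, the score would be 0.1 + 1.0 - 2.0 = -0.9.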
Development of predictive models
The most significant features were used to construct a radiomics model based on logistic regression. The likelihood ratio test with backward step-down selection was applied to the multivariate logistic regression model. We grouped the selected features into seven models: the radiomics model (radiomics features), the CTSS model (semi-quantitative CTSS), the CTLP model (quantitative CTLP), the clinical model (clinical features), the integrated A model (CTSS + CTLP + clinical features), the integrated B model (clinical features + radiomics features), and the integrated C model (radiomics features + CTSS + CTLP + clinical features). Calibration curves were used to investigate the performance characteristics of the nomograms.
Statistical analysis
Statistical analyses were performed with the Institute of Precision Medicine Statistics (IPMs, version 1.1, GE Healthcare). Differences in all variables between the NSD and SD groups were assessed using the Mann-Whitney U test or independent-samples t test for continuous variables, and the chi-square test or Fisher’s exact test for categorical variables. Univariate analysis was used to estimate the relationship between clinical factors and the identification of the two subtypes. The performance of the seven models was assessed by the area under the receiver operating characteristic curve (AUC), specificity, and sensitivity. The optimal cut-off points for predicting the severity of COVID-19 were determined by Youden’s index. Pairwise comparisons among the seven models were performed with the DeLong test using the R software (v. 3.6.0; http://www.Rproject.org). A two-sided p < 0.05 was considered statistically significant throughout the study.
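To make the AUC and Youden's index steps concrete, the sketch below computes both from model scores and binary labels (1 = SD, 0 = NSD). It is an illustration only and not the IPMs or R code used in the study; the function names and the convention of predicting SD when the score is at or above the threshold are assumptions. Youden's index is J = sensitivity + specificity − 1, and the cut-off maximizing J is returned.

```python
def roc_points(scores: list[float], labels: list[int]):
    """List (threshold, sensitivity, specificity) for each distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = []
    for t in sorted(set(scores)):
        # Assumed convention: predict positive when score >= threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores: list[float], labels: list[int]):
    """Return the (threshold, sensitivity, specificity) maximizing J."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)


def auc(scores: list[float], labels: list[int]) -> float:
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

When several thresholds give the same J, this sketch returns the lowest one; other tie-breaking rules are equally defensible.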