Introduction

By the end of 2019, a novel coronavirus, known as SARS-CoV-2, was discovered after causing infections and pneumonia cases initially in Wuhan, China, and later across the globe. The World Health Organization (WHO) named the acute infectious disease caused by SARS-CoV-2 COVID-19 (Coronavirus Disease 2019) [1, 2]. COVID-19 is a systemic infectious disease mainly characterized by inflammation of the human respiratory system and by high contagiousness. Currently, the diagnosis of COVID-19 is confirmed by real-time reverse transcription-polymerase chain reaction (RT-PCR) [3, 4]. However, the presence of the virus in the upper airways is transient, and RT-PCR displays a low sensitivity of 71% and requires dedicated instrumentation to be readily available, which limits its wide use during a pandemic [5, 6].

Most patients with suspected pneumonia undergo chest radiography (XR) and computed tomography (CT) to assess the presence and patterns of infiltrates. COVID-19 typically presents with bilateral lung infiltrates with patterns resembling ground-glass and consolidation [3, 6, 7]. These radiological characteristics are informative but are similar to signs from other acute respiratory syndromes like SARS (Severe Acute Respiratory Syndrome) and MERS (Middle East Respiratory Syndrome) [8]. Moreover, as those characteristics describe the internal structure of lung lesions subjectively, qualitatively, or semi-quantitatively, they can lead to intra- and inter-observer variability [9, 10]. Due to the limitations of the aforementioned methods (i.e., RT-PCR and visual/qualitative radiological assessment), a quantitative/computational approach may add value to the clinical routine.

Computer-aided diagnosis/detection (CAD) tools help to improve the interpretation of radiological findings and to identify diseases in early stages [11, 12]. The goal of CAD is to improve the accuracy and consistency of medical image diagnosis and interpretation using the suggestion provided by a computer. CAD tools traditionally provide a single answer (second opinion) to specialists, but not short-term prognostic information, which limits their applicability to the clinical routine [13, 14].

The field of radiomics has emerged as a promising quantitative approach to develop medical imaging biomarkers and support clinical decisions [12, 15]. Radiomics is an extension of CAD that associates computer-extracted medical image features with clinical endpoints (e.g., genomics, staging, survival, recurrence, among others). This radiomic association allows a more comprehensive characterization of the underlying phenotype, ultimately increasing the power of decision support models [14, 16]. The recent advances in targeted therapies for precision medicine require an inexpensive and easily obtainable imaging approach for phenotyping diseases, and radiomics can provide it as it is a non-invasive, fast, low-cost, and reproducible tool [9, 15].

Therefore, our goal in this work is to use radiomics to identify XR quantitative imaging biomarkers for COVID-19. For this purpose, we first segmented the lungs automatically from radiography images; then extracted quantitative features from the segmented regions of interest (ROIs); and finally associated them with COVID-19 endpoints, such as etiology and patient survival.

Materials and Methods

Patients

In this study, we used XR images of 227 patients from publicly available cohorts, and hence, no institutional review board approval was needed. We used three cohorts, together forming the discovery set, to discover potential biomarkers for COVID-19, and the other two cohorts as an independent validation set.

The discovery set was composed of 195 patients:

  (a) 29 patients with COVID-19 that had images and clinical data provided by the Italian Society of Medical and Interventional Radiology [17];

  (b) 127 cases of pneumonia non-related to COVID-19 from the Spanish chest XR cohort (PadChest dataset) with image-associated reports from patients that attended the San Juan de Alicante Hospital, University of Alicante, Spain [18];

  (c) 39 patients with pneumonia non-related to COVID-19 from the National Library of Medicine, National Institutes of Health (OpenI dataset), who attended various hospitals of the Indiana University School of Medicine, USA [19].

Physicians performed image labeling for the cases from the PadChest and OpenI datasets. The validation set was composed of 32 patients from around the world. In this set, images and clinical data were initially collected from the literature by researchers of the University of Montreal [20]. We then completed it with follow-up data (survival time, imaging and event dates) to perform the radiomic analysis. Of those 32 patients, 20 had the diagnosis of COVID-19 confirmed by RT-PCR, and 12 had pneumonia of a different etiology, distributed as follows: one caused by the fungal pathogen Pneumocystis, two by the bacteria Streptococcus, four had ARDS, and five had SARS. Table 1 describes demographic data from the cohorts.

Table 1 Description of the patients

Image Segmentation

We first automatically segmented the lungs from the radiography using an algorithm based on an artificial intelligence model (i.e., the convolutional neural network U-Net) [21, 22]. Although the model was previously trained and assessed for lung segmentation, yielding a performance (Dice coefficient) of 0.978 [22], we evaluated the performance of the model with the images from this work to enable a robust radiomic analysis. Two experienced medical image analysts (one with 12 years of experience in CT and XR imaging and one with 9 years of experience in XR, intravascular optical coherence tomography, and ultrasound imaging) manually segmented the lungs to serve as reference for the images segmented by the model. The Dice coefficient and the Jaccard index obtained by comparing the automatic with the manual segmentation of all images were, respectively, 0.951 (± 0.031 of standard deviation) and 0.909 (± 0.053 of standard deviation).
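The two overlap metrics used in this evaluation can be sketched as follows. This is a minimal illustration on toy binary masks, not the evaluation code used in the study; the function names and the example arrays are assumptions introduced here.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(2.0 * np.logical_and(pred, ref).sum() / total)


def jaccard_index(pred, ref):
    """Jaccard = |A intersect B| / |A union B| for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(np.logical_and(pred, ref).sum() / union)


# Toy masks (hypothetical): the automatic mask has one extra pixel.
auto_mask = np.array([[0, 1, 1],
                      [0, 1, 1],
                      [0, 0, 0]])
manual_mask = np.array([[0, 1, 1],
                        [0, 1, 0],
                        [0, 0, 0]])

dice = dice_coefficient(auto_mask, manual_mask)   # 6/7
jaccard = jaccard_index(auto_mask, manual_mask)   # 3/4
```

For a single pair of masks the two metrics are related by J = D / (2 - D), so they rank segmentations identically; reporting both mainly eases comparison with other studies.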

The image segmentation algorithm created a binary mask of both lungs and then an ROI from the extreme points of the lung mask to generate the segmented image. This segmentation step removed anatomical structures that are unnecessary for pneumonia assessment, such as head, neck, and arms, along with textual information related to the exam (such as DICOM metadata).
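One plausible reading of "an ROI from the lung mask extreme points" is a bounding-box crop around the mask. The sketch below assumes that reading; the function name and the toy arrays are hypothetical, and whether pixels outside the lungs but inside the box were kept is not stated in the text.

```python
import numpy as np


def crop_to_mask(image, mask):
    """Crop `image` to the smallest rectangle that contains every mask pixel."""
    mask = np.asarray(mask, dtype=bool)
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    if not rows.any():
        raise ValueError("mask is empty")
    row_idx = np.where(rows)[0]
    col_idx = np.where(cols)[0]
    r0, r1 = row_idx[0], row_idx[-1]
    c0, c1 = col_idx[0], col_idx[-1]
    return image[r0:r1 + 1, c0:c1 + 1]


# Toy 6x6 "radiograph" with two rectangular "lungs" (hypothetical data).
image = np.arange(36).reshape(6, 6)
mask = np.zeros((6, 6), dtype=bool)
mask[1:4, 1:3] = True  # first lung
mask[2:5, 3:5] = True  # second lung

roi = crop_to_mask(image, mask)  # rows 1..4 and columns 1..4 of the image
```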

Feature Extraction

Quantitative radiography-based features on all imaging levels (first order, second order, and higher order) were investigated as potential biomarkers for COVID-19. These radiomic features were extracted from each image segmented by the method described previously. The first-order features describe the gray-level distribution of an image without considering pixel locations. Second-order features describe the spatial relationships of gray levels inside the ROI. On the other hand, higher-order features simultaneously evaluate location and relationships between pixels without considering spatial properties by using image filtering [9, 11, 14].
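The difference between first- and second-order features can be illustrated with two toy images that share the same histogram but differ in spatial arrangement. This is a simplified sketch, not the PyRadiomics implementation: the function names are introduced here, and the co-occurrence matrix uses a single horizontal offset only.

```python
import numpy as np


def first_order_entropy(image, levels):
    """Shannon entropy (bits) of the gray-level histogram; ignores pixel positions."""
    counts = np.bincount(image.ravel(), minlength=levels)
    p = counts[counts > 0] / image.size
    return float(-(p * np.log2(p)).sum())


def glcm_horizontal(image, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (0, 1)."""
    glcm = np.zeros((levels, levels), dtype=float)
    left = image[:, :-1].ravel()
    right = image[:, 1:].ravel()
    np.add.at(glcm, (left, right), 1.0)  # count each horizontal neighbor pair
    glcm = glcm + glcm.T                 # make the matrix symmetric
    return glcm / glcm.sum()


def glcm_contrast(glcm):
    """Contrast = sum over (i, j) of (i - j)^2 * p(i, j)."""
    i, j = np.indices(glcm.shape)
    return float(((i - j) ** 2 * glcm).sum())


# Two binary toy images with identical histograms (8 zeros, 8 ones).
blocks = np.array([[0, 0, 1, 1]] * 4)
stripes = np.array([[0, 1, 0, 1]] * 4)

entropy_blocks = first_order_entropy(blocks, levels=2)
entropy_stripes = first_order_entropy(stripes, levels=2)
contrast_blocks = glcm_contrast(glcm_horizontal(blocks, levels=2))
contrast_stripes = glcm_contrast(glcm_horizontal(stripes, levels=2))
```

Both images have the same first-order entropy, while the co-occurrence contrast separates them (1/3 for the block image, 1 for the striped image), which is the kind of spatial information second-order features add.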

For each patient, first-order features were extracted from the gray-level histogram of the segmented radiography (first-order histogram). Eighteen statistical measures were calculated from each image histogram. The second-order features were extracted from five gray-level matrices (second-order texture): co-occurrence matrix (GLCM, 24 features), run-length matrix (GLRLM, 16 features), size zone matrix (GLSZM, 16 features), dependence matrix (GLDM, 14 features), and neighboring gray-tone difference matrix (NGTDM, 5 features). Higher-order features were obtained from wavelet transforms and a square filter (higher-order spectrum). Coiflet transforms were applied to decompose the image into four different frequency domain bands (HH, HL, LH, and LL). The square filter squared the gray levels and linearly scaled them back to the radiography’s original range. After filtering, the first- and second-order measures were calculated on the filtered image’s histogram or matrix. The radiomic features were extracted using the PyRadiomics v3.0 package, in compliance with the Imaging Biomarker Standardization Initiative (IBSI) [16, 23]. A total of 558 radiomic features characterized each patient included in the analysis (Table 2).
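The reported total of 558 features is consistent with computing the 93 first- and second-order measures on six images per patient. The sketch below reproduces that arithmetic under the assumption (not stated explicitly in the text) that the six images are the original, the four wavelet bands, and the square-filtered image. It also shows one simple way to square gray levels and rescale them linearly to the original range; the exact scaling used by PyRadiomics may differ in detail, and `square_filter` is a name introduced here.

```python
import numpy as np

# Feature counts per class, as listed in the text.
FEATURES_PER_CLASS = {
    "first_order": 18,
    "GLCM": 24,
    "GLRLM": 16,
    "GLSZM": 16,
    "GLDM": 14,
    "NGTDM": 5,
}

# Assumed set of images on which the features are computed.
IMAGE_TYPES = ["original", "wavelet-HH", "wavelet-HL",
               "wavelet-LH", "wavelet-LL", "square"]

features_per_image = sum(FEATURES_PER_CLASS.values())    # 93
total_features = features_per_image * len(IMAGE_TYPES)   # 558


def square_filter(image):
    """Square the gray levels, then divide by the peak so the maximum is unchanged."""
    image = np.asarray(image, dtype=float)
    peak = np.abs(image).max()
    if peak == 0:
        return image.copy()
    return image ** 2 / peak


# Toy 8-bit patch (hypothetical values).
patch = np.array([[0, 64],
                  [128, 255]])
filtered = square_filter(patch)  # dark and mid-gray values are compressed
```

Because the mapping is quadratic, differences among bright pixels are stretched relative to differences among dark pixels, which changes the histogram and texture matrices computed afterwards.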

Table 2 List of all features extracted for the radiomic analysis

Statistical Analysis

A univariate analysis statistically evaluated the radiomic association between XR features and COVID-19 diagnosis, using the receiver operating characteristic (ROC) curve with sensitivity and specificity metrics. The Mann–Whitney U test evaluated the statistical difference between feature distributions from the groups of patients with pneumonia [9]. Each feature had the area under the ROC curve (AUC) and p-value calculated individually.
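The per-feature evaluation described above can be sketched as follows. The feature values are hypothetical and only illustrate the computation; the AUC is obtained directly from pairwise comparisons, and the p-value comes from `scipy.stats.mannwhitneyu`, which is assumed here to be the SciPy routine corresponding to the test named in the text.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def auc_from_scores(positive, negative):
    """AUC as the probability that a positive case scores above a negative one.

    Ties count as one half, which matches the area under the empirical ROC curve.
    """
    pos = np.asarray(positive, dtype=float)[:, None]
    neg = np.asarray(negative, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (pos.size * neg.size))


# Hypothetical normalized values of one radiomic feature in two groups.
covid_values = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74]
other_values = [0.41, 0.58, 0.49, 0.35, 0.52, 0.60]

auc = auc_from_scores(covid_values, other_values)  # 34 of 36 pairs are ordered correctly
p_value = mannwhitneyu(covid_values, other_values,
                       alternative="two-sided").pvalue

# A feature is kept as a candidate biomarker when the test is significant.
is_candidate = p_value < 0.05
```

In a full analysis this would be repeated for each of the 558 features, so each feature gets its own AUC and p-value.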

The short-term prognostic analysis was performed by correlating the radiomic features with overall and deterioration-free survival using the Kaplan–Meier time-to-event method. Higher and lower-risk groups of patients were split according to the median value of the quantitative features [12]. As the number of patients with follow-up data (survival time and outcome result) was relatively low for this analysis, we combined all cases with COVID-19 in a single set of 28 patients (14 from discovery and 14 from the validation set).
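The median split and the Kaplan–Meier estimate can be sketched as below. This is a minimal illustration with hypothetical follow-up data, not the authors' R or SciPy code; `median_split` and `kaplan_meier` are names introduced here.

```python
import numpy as np


def median_split(values):
    """Return True for patients whose feature value is above the median."""
    values = np.asarray(values, dtype=float)
    return values > np.median(values)


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability).

    `events` is 1/True when the event was observed and 0/False when censored.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    survival = 1.0
    curve = []
    for t in np.unique(times[events]):
        at_risk = (times >= t).sum()
        observed = ((times == t) & events).sum()
        survival *= 1.0 - observed / at_risk
        curve.append((float(t), float(survival)))
    return curve


# Hypothetical data for six patients.
feature = np.array([0.10, 0.25, 0.40, 0.15, 0.30, 0.05])
times = np.array([5, 20, 18, 9, 25, 12])   # follow-up in days
events = np.array([1, 0, 1, 1, 0, 1])      # 1 = event, 0 = censored

high = median_split(feature)
curve_high = kaplan_meier(times[high], events[high])
curve_low = kaplan_meier(times[~high], events[~high])
```

With these toy values the low-feature group drops to zero survival by day 12, while the high-feature group has a single event at day 18.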

The mean follow-up time was 20.4 days (±7.1 of standard deviation). Overall survival analysis used death from any cause as the event, and deterioration-free survival analysis used worsening of clinical/radiological conditions or death from any cause. Patients who survived, remained clinically stable, or were lost to follow-up were censored. The log-rank test assessed the statistical difference between the survival curves from both stratified groups to identify features with potential prognostic value [15].
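A two-group log-rank test can be sketched as follows. This is a textbook formulation written for illustration, not the routine used in the study; the follow-up data are hypothetical, and the p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x / 2)).

```python
import math

import numpy as np


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    times = np.concatenate([times_a, times_b]).astype(float)
    events = np.concatenate([events_a, events_b]).astype(bool)
    in_a = np.concatenate([np.ones(len(times_a), dtype=bool),
                           np.zeros(len(times_b), dtype=bool)])
    observed_minus_expected = 0.0
    variance = 0.0
    for t in np.unique(times[events]):
        at_risk = times >= t
        n = at_risk.sum()                            # at risk in both groups
        n_a = (at_risk & in_a).sum()                 # at risk in group A
        d = ((times == t) & events).sum()            # events at time t
        d_a = ((times == t) & events & in_a).sum()   # events in group A
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = float(observed_minus_expected ** 2 / variance)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical follow-up (days) and event indicators (1 = event, 0 = censored).
high_risk_times = [5, 9, 12, 7, 15]
high_risk_events = [1, 1, 1, 1, 0]
low_risk_times = [20, 18, 25, 22, 24]
low_risk_events = [0, 1, 0, 0, 0]

chi2, p_value = logrank_test(high_risk_times, high_risk_events,
                             low_risk_times, low_risk_events)
```

Censored patients contribute to the at-risk counts until their last follow-up but never to the event counts, which is how the censoring rule described above enters the test.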

The SciPy v1.2.3 and R v3.4.4 packages were used to perform statistical analysis. Tests with p < 0.05 were considered statistically significant.

Results

Demographic Findings

Of the 49 patients later diagnosed with COVID-19 (29 from the discovery set and 20 from the validation set), 44 had clinical data publicly available for analysis. All 44 patients attended a hospital after onset, mainly with fever (52% of the cases with symptoms data available), cough (27%), and dyspnea (25%). Twenty-six patients were men with a mean age of 53.1 years (±16.2 of standard deviation), and 18 were women with a mean age of 61.4 years (±12.1 of standard deviation). We divided the sample into discovery and independent validation sets (Table 1). Figure 1 depicts the radiomic analysis performed in this study.

Fig. 1

Workflow employed in this work: (a) radiomic pipeline for the association between radiographic features and COVID-19 endpoints; (b) radiomic analysis performed to identify potential biomarkers for the diagnosis of COVID-19

Diagnostic Biomarker Findings

Statistical analysis identified 176 radiomic features associated with COVID-19 in the Spanish discovery set. Seventy-nine of those obtained a significant correlation with SARS-CoV-2 in the validation set (p < 0.05). Moreover, we identified 243 radiomic features associated with COVID-19 in the American discovery set. Fifty-three of those also obtained a significant correlation with the novel coronavirus in the validation set (p < 0.05).

The intersection set between the features identified in the validation set resulted in 51 radiomic biomarkers for COVID-19 (Fig. 1b). Figure 2 shows the most significant radiomic biomarkers for COVID-19 and their respective AUC value. Most of them were higher-order features extracted after the wavelet (HH band) transform (41%). Figure 3 presents boxplots of the values from some features associated with pneumonia caused by the SARS-CoV-2 virus. One wavelet feature identified as f521, according to the supplementary material, obtained the best association with COVID-19. The feature f521 yielded an AUC of 0.867, sensitivity of 0.85, and specificity of 0.67 (Fig. 4). The significant features identified only in the American discovery set were f56, yielding an AUC of 0.775, and f54 with an AUC of 0.742. The most significant features identified only in the Spanish discovery set were f74, f76, and f246 with AUC of 0.821.
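Sensitivity and specificity follow from confusion-matrix counts. The sketch below is illustrative only: the counts are hypothetical, chosen because 17 of 20 and 8 of 12 would reproduce the reported 0.85 and 0.67 if f521 had been evaluated on the 20 COVID-19 and 12 non-COVID-19 validation patients. The text does not state the actual counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of COVID-19 cases detected
    specificity = tn / (tn + fp)  # fraction of non-COVID-19 cases rejected
    return sensitivity, specificity


# Hypothetical counts (assumption, see the paragraph above).
sens, spec = sensitivity_specificity(tp=17, fn=3, tn=8, fp=4)
sens_rounded = round(sens, 2)  # 0.85
spec_rounded = round(spec, 2)  # 0.67
```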

Fig. 2

Most significant radiomic biomarkers for COVID-19. At the end of each feature name, there is a statistical significance symbol used according to the following notation: *** for p < 0.001, ** for 0.001 ≤ p < 0.01, and * for 0.01 ≤ p < 0.05

Fig. 3

Distribution of some significant radiomic features associated with COVID-19. The dashed line depicts the mean value of the feature for the corresponding group

Fig. 4

Performance of the feature f521 to recognize COVID-19 radiographic patterns: (a) ROC curve; (b) true positive XR of a 40-year-old woman with COVID-19 presented as a very discrete ground-glass opacity in the right lower lobe; (c) false negative XR of a 50-year-old woman with COVID-19 presented as multiple small bilateral patchy opacifications

Short-term Prognostic Biomarker Findings

Because only a small number of patients had follow-up data for time-to-event analyses, we combined all patients with COVID-19 in a single set (n = 28). One radiomic feature was identified with short-term prognostic value to predict overall survival. The feature f287 yielded a significant difference in overall survival rates between the stratified risk groups of COVID-19 patients. The normalized threshold value used for stratification was 0.177509. High values (greater than the median) of the potential biomarker identified lower-risk patients with a mean survival time of 25 days. This group was composed of six women (58.5 ±14.2 years) and seven men (51.6 ±9.6 years). One patient did not have all the clinical data available. Furthermore, low values (less than the median) of the feature stratified patients with higher risk and a mean survival time of 13 days. This group included four women (65.5 ±13.0 years) and ten men (61.2 ±11.7 years).

Figure 5 presents two examples of COVID-19 patients stratified by the radiomic feature f287 identified with prognostic potential. It is important to emphasize that both patients from Fig. 5 were correctly classified as COVID-19 cases by the radiomic biomarker f521, which yielded the highest performance to detect COVID-19, as previously reported.

Fig. 5

Radiography image, gray-level histogram, and three-dimensional surface plot of COVID-19 patients stratified by the radiomic biomarker f287: (a) 67-year-old woman with bilateral consolidation and 13 days of survival (no death event during follow-up), classified as a lower-risk case by the biomarker; (b) 36-year-old man with scattered consolidation and nine days of survival until death, classified as a higher-risk case by the biomarker. Although both cases look visually very similar, as described by radiological assessment and gray-level distributions, the higher-order radiomic biomarker could stratify the risk of the patient according to spectral properties of the radiographic image

Five radiomic features (f143, f174, f294, f340, and f344) presented prognostic value to predict patient deterioration-free survival (Table 3). The biomarker f174 yielded the most significant difference in survival curves, using the normalized median of 0.328535 as the threshold for stratification. High values of the feature identified lower-risk patients, while low values stratified patients with a higher risk of deterioration (12 events with a mean deterioration-free survival time of 4.8 days and a hazard ratio of 3.2). The lower-risk patient group included five women (57.4 ±13.4 years) and nine men (55.3 ±13.1 years). The higher-risk patient group included five women (65.2 ±13.8 years) and eight men (57.6 ±10.4 years). One patient did not have all the clinical data available. Figure 6 presents the Kaplan–Meier curves of risk groups of COVID-19 patients identified by the most significant radiography-based radiomic feature.

Table 3 Radiomic features associated with deterioration-free survival of COVID-19 patients

Fig. 6

Kaplan–Meier deterioration-free survival curves of COVID-19 patients stratified by the radiomic biomarker f174

Discussion

In this study, we provided evidence that 57 radiomic features from chest radiographs can improve diagnostic specificity and identify patients at risk of a worse short-term outcome in COVID-19. The early diagnosis of COVID-19 is crucial for the patient’s isolation to prevent virus spread and for rapid treatment decisions to improve the patient’s short-term prognosis [24]. Medical imaging plays a critical role in evaluating COVID-19, mainly in staging the disease’s extent and monitoring the progression after treatment (dexamethasone for critically ill patients on ventilators, for instance) [4, 25, 26]. In early stages, multiple small patchy shadows and interstitial changes emerge in the lungs, while in severe stages, the lesions aggravate, leading to massive infiltrating consolidations and ground-glass opacities, ultimately changing into fibrosis in a dissipative phase [2, 27].

Most of those radiological characteristics are subjectively evaluated with CT imaging as it has higher sensitivity on visual assessment than radiography [5, 7]. However, XR is more accessible and exposes the patient to less radiation. Therefore, chest radiographic biomarkers may have a significant impact on supporting clinical decisions. It is widely known that COVID-19 has a worse prognosis in older people and patients with chronic comorbidities (e.g., hypertension, diabetes, and cardiac diseases) due to their weaker immune system [1, 4].

In this work, we identified XR features associated with COVID-19 that can stratify the patient’s short-term risk even without comorbidity conditions and at an early stage of care (at hospital admission, for instance). These features could indicate the patient’s rapid worsening before the clinical condition deteriorates and when treatment is more likely to have greater benefit. Moreover, f287 and f174 identified patients at a higher/lower risk, which is consistent with the worse short-term prognosis reported for men in comparison to women [28].

The COVID-19-correlated features f521, f287, and f174 highlight the challenge of visually recognizing intricate XR patterns, as they were uncovered only after wavelet transform or square filtering. Thus, the radiomic analysis of COVID-19 identified biomarkers only at a higher-order imaging level with frequency domain analysis. These higher-order features traditionally describe different properties of spectral components from an ROI, characterizing image heterogeneity [13, 15], but the wavelet transforms made it possible to capture higher textural heterogeneity on radiographs from COVID-19 and not from other pneumonia etiologies.

Radiomic models have previously been developed to improve chest radiographic assessment of pneumonia cases. Sousa et al. [29] used wavelet-derived features as input to three different multivariate methods to detect childhood pneumonia. Chandra et al. [30] employed five different artificial intelligence techniques with first-order histogram features to detect adult pneumonia in XR. Deep-learning models have also been used to detect pediatric pneumonia [31, 32], but all of these studies were done prior to COVID-19.

The small sample size of the cohorts precludes the generalization of our findings, which will require validation in future studies. That will be facilitated by the fact that the pandemic has not disappeared, and soon a large number of images will be available in the public domain to validate/improve these findings. Moreover, as data-sharing policies are being heavily encouraged, we expect to access clinical data to increase the number of candidate biomarkers for COVID-19. A prospective evaluation of the biomarkers will also be necessary to confirm the differences in texture and spectrum between images with visually similar radiological assessments. Further validation of these biomarkers may also be instrumental in teleradiology to reduce the gap for distant resource-limited places, where X-ray scanners are the only imaging healthcare option to assist diagnostics and predict outcomes of COVID-19 patients.

Conclusion

Altogether, we identified 57 radiomic biomarkers (51 diagnostic and 6 prognostic, p < 0.05) correlated with the etiologic agent of acute infectious diseases and short-term outcomes of COVID-19 patients. The biomarkers f521, f287, and f174 have the potential to improve the clinical routine as they could automatically prioritize the exams from higher-risk patients with COVID-19 for further reading by a specialist and investigation. The feature f521 specifically distinguished pneumonia etiologies and could be used as an initial biomarker to support the early identification of COVID-19. Finally, the biomarkers f521, f287, and f174 could recommend the need for intensive care with mechanical ventilation, for instance, ultimately leading to better outcomes such as decreased deterioration and mortality.

It will also be paramount to test the usefulness of these radiomic biomarkers to predict or anticipate the critical cases, especially the ones requiring attention to thrombotic events.