Introduction

With the advancement of precision medicine, molecular targeted therapy has been widely used in the treatment of lung cancer. Several studies have shown that epidermal growth factor receptor (EGFR) mutation status provides a basis for individualized therapy in lung adenocarcinoma patients [1,2,3,4]. EGFR-mutant patients treated with EGFR-tyrosine kinase inhibitors (EGFR-TKIs) achieve longer progression-free survival and better response rates than those treated with conventional chemotherapy [5, 6]. Therefore, the National Comprehensive Cancer Network guidelines recommend routine detection of EGFR mutations to guide molecularly targeted therapy for lung adenocarcinoma patients [7].

Conventional identification of EGFR mutations requires biopsy and genetic testing, which have several limitations in clinical practice: (1) the potential risk of tumor metastasis during biopsy; (2) the difficulty of obtaining representative tumor tissue because of tumor genetic heterogeneity; (3) not all tumors are suitable for biopsy, depending on their size and location; (4) a lack of adequate material and high-quality DNA may lead to testing failure; (5) genetic mutations may change throughout treatment, while repeated biopsies are impractical. In addition, the economic and time costs of biopsy should be considered [8,9,10,11]. Therefore, there is an urgent need for a reliable, safe, convenient, and cost-effective method for the non-invasive prediction of EGFR mutation status in lung adenocarcinoma patients, to assist clinicians in selecting appropriate patients for EGFR-TKI treatment, support individualized decision-making, improve patient prognosis, and avoid wasting medical resources.

As an emerging data mining technique, radiomics has attracted increasing attention for its advantages in providing objective and quantifiable imaging information, which can be used for differential diagnosis, genetic analysis, clinical staging, therapeutic evaluation, and prognosis prediction. The main steps of radiomics analysis are as follows: (1) acquisition and pre-processing of medical images (CT, MR, X-ray, ultrasound, PET, and so on); (2) segmentation of volumes of interest (VOI), which can be done manually by radiologists or automatically or semi-automatically by software; (3) feature extraction, extracting high-throughput features from VOIs, including shape features, first-order statistical features, texture features, and higher-order statistical features; (4) feature selection, excluding the non-repeatable, redundant, and irrelevant features from a large number of extracted features; (5) model construction, constructing the prediction model based on machine learning methods for a specific clinical problem, and training it [12,13,14].

Recent studies have demonstrated that radiomic features extracted from lung CT images can predict EGFR mutation status [10, 15,16,17,18]. However, most studies focus on intratumoral lesions and give little attention to subtle changes in the peritumoral region. Recent cancer studies have shown that as cancer infiltrates and metastasizes, the lung parenchyma surrounding the tumor may also be affected, and that changes in the microenvironment, such as tumor angiogenesis, lymphangiogenesis, and microvascular and lymphatic infiltration, can provide valuable clinical information. This information may reflect the biological behavior of the tumor and thus help to characterize tumor aggressiveness and predict prognosis [19, 20]. Therefore, mining peritumoral radiomic features may identify new biological markers for the non-invasive prediction of EGFR mutation in lung adenocarcinoma. We aimed to develop a radiomics model combining intratumoral and peritumoral features to predict EGFR mutation status in lung adenocarcinoma patients non-invasively. We also explored the optimal peritumoral range, defined as the range corresponding to the highest AUC of the prediction model, which may be helpful for targeted therapy of lung adenocarcinoma.

Materials and methods

This retrospective study was approved by the Institutional Review Board of The Second Xiangya Hospital (No. 2022K012), which waived the requirement for patients’ informed consent in accordance with the Council for International Organizations of Medical Sciences (CIOMS) guidelines.

Patients

We collected three datasets for analysis. Figure 1 shows the patient inclusion flowchart and dataset partition. Dataset 1 and dataset 2 were collected from two hospitals with the following inclusion criteria: (1) available non-contrast-enhanced thin-slice chest CT (0.75–1.5 mm) scan before biopsy or surgical treatment; (2) available pathological report of lung adenocarcinoma; (3) available EGFR mutation testing report; and (4) no prior treatment before EGFR mutation analysis. Dataset 3 was collected from The Cancer Imaging Archive (TCIA) public database with the following inclusion criteria: (1) available non-contrast-enhanced CT images with slice thickness ≤ 1.5 mm (to avoid data inconsistency); (2) available pathological report of lung adenocarcinoma; (3) available EGFR mutation testing report; and (4) lesions that could be identified with certainty as the resected or biopsied lesions. Patients with a CT slice thickness > 1.5 mm, with pathologically confirmed tumors other than lung adenocarcinoma, or without an EGFR mutation testing report were excluded. CT acquisition and scanning parameters for dataset 1 and dataset 2 are presented in Supplementary Material 1.

Fig. 1

Patients’ inclusion flowchart and datasets partition. EGFR+— EGFR mutant; EGFR-—EGFR wild-type

A total of 779 patients were included in this study and were divided into EGFR+ and EGFR− groups. Dataset 1, comprising 640 patients collected from Huadong Hospital from January 2013 to December 2018, was randomly divided into a training set (384 patients, 60.0%), a validation set (128 patients, 20.0%), and an internal testing set (128 patients, 20.0%). Dataset 2 comprised 103 patients collected from the Second Xiangya Hospital from January 2020 to March 2021. Dataset 3 comprised 36 patients from TCIA. Dataset 2 and dataset 3 were combined to form an independent external testing set.
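The 60/20/20 partition of dataset 1 can be illustrated with a minimal sketch. It assumes a simple seeded random shuffle of patient identifiers; the seed, the function name `split_dataset`, and the absence of stratification by EGFR status are assumptions made for illustration, since the text states only that the division was random.

```python
import random

def split_dataset(patient_ids, fractions=(0.6, 0.2, 0.2), seed=42):
    """Randomly partition patient IDs into training, validation, and testing sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seed is an assumption, used only for reproducibility
    n_train = round(len(ids) * fractions[0])
    n_val = round(len(ids) * fractions[1])
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder, so that no patient is dropped
    return train, val, test

# Dataset 1 contains 640 patients; integer IDs stand in for real identifiers.
train, val, test = split_dataset(range(640))
print(len(train), len(val), len(test))  # 384 128 128
```

Taking the testing set as the remainder guarantees that the three subsets are disjoint and together cover every patient, whatever the cohort size.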

Tumor segmentation and radiomic feature extraction

First, intratumoral VOIs (VOI_I) were delineated manually along the lesion boundary on every slice until the entire lesion was covered. Delineation was performed by a radiologist with 5 years of experience in chest radiology and then confirmed or modified by a radiologist with 10 years of experience in chest radiology, using 3D Slicer software (version 4.10.1, Brigham and Women’s Hospital). In patients with multiple lesions, only one lesion was delineated because of the limited availability of EGFR testing reports. Second, to generate peritumoral regions, we used a dilation operation from the “SimpleITK” library in Python to automatically expand VOI_I by 1 mm, 2 mm, 3 mm, 4 mm, 5 mm, 10 mm, and 15 mm. In essence, this approach enlarges the tumor mask by a specified distance in millimeters. The tumor region was represented as a binary mask in which tumor voxels were marked as 1 and background voxels as 0. The mask was then dilated using a spherical structuring element whose radius corresponded to the desired extension distance. These peritumoral regions included air in the lungs, pulmonary vessels, and bronchi, and did not include the chest wall or mediastinum. Figure 2 shows the process of tumor segmentation and its expansion into the peritumoral region. Finally, three kinds of regions were created: (1) intratumoral regions only (VOI_I); (2) peritumoral regions only (VOI_P): VOI_P1, VOI_P2, VOI_P3, VOI_P4, VOI_P5, VOI_P10, and VOI_P15; (3) combined intratumoral and peritumoral regions: VOI1, VOI2, VOI3, VOI4, VOI5, VOI10, and VOI15. Images with VOI information were exported in NIfTI (.nii) format for the next step of analysis.
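The dilation step can be sketched in plain Python to make the geometry explicit. This is an illustrative stand-in for the SimpleITK call used in the study, not a reproduction of it. It assumes the mask is held as a set of (z, y, x) voxel indices. The function names and the optional `lung_voxels` argument are hypothetical; the latter is one possible way to keep the ring out of the chest wall and mediastinum, for which the text does not state the method.

```python
import math
from itertools import product

def ball_offsets(radius_mm, spacing_mm):
    """Voxel offsets whose physical distance from the origin is at most radius_mm."""
    reach = [math.floor(radius_mm / s + 1e-9) for s in spacing_mm]
    limit = radius_mm ** 2 + 1e-9  # small tolerance against floating-point round-off
    offsets = []
    for dz, dy, dx in product(*(range(-r, r + 1) for r in reach)):
        dist_sq = ((dz * spacing_mm[0]) ** 2 + (dy * spacing_mm[1]) ** 2
                   + (dx * spacing_mm[2]) ** 2)
        if dist_sq <= limit:
            offsets.append((dz, dy, dx))
    return offsets

def peritumoral_regions(tumor_voxels, radius_mm, spacing_mm=(1.0, 1.0, 1.0),
                        lung_voxels=None):
    """Return (ring, combined): the peritumoral ring and tumor-plus-ring voxel sets."""
    tumor = set(tumor_voxels)
    offsets = ball_offsets(radius_mm, spacing_mm)
    # Binary dilation: every tumor voxel stamps the spherical structuring element.
    grown = {(z + dz, y + dy, x + dx)
             for (z, y, x) in tumor for (dz, dy, dx) in offsets}
    ring = grown - tumor
    if lung_voxels is not None:
        ring &= set(lung_voxels)  # assumption: restrict the ring to a lung mask
    return ring, tumor | ring

# A single-voxel "tumor" on a 1 mm isotropic grid.
tumor = {(5, 5, 5)}
ring_1mm, combined_1mm = peritumoral_regions(tumor, 1.0)
print(len(ring_1mm), len(combined_1mm))  # 6 7
```

In the notation of the text, `ring` corresponds to a VOI_P region and `combined` to the matching VOI region (for example VOI_P4 and VOI4 for a 4 mm radius).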

Fig. 2

The workflow of the study. VOI—volume of interest; EGFR—Epidermal growth factor receptor; EGFR+—EGFR mutant; EGFR−—EGFR wild-type; LASSO—the least absolute shrinkage and selection operator; mRMR—the minimum redundancy maximum relevance algorithm

The original images were resampled to a uniform voxel size of 1 × 1 × 1 mm³ by cubic interpolation to achieve consistent spatial resolution. Hounsfield units (HU) were standardized by applying a consistent intensity window to all images, typically ranging from −1000 HU (air) to 1000 HU (bone). Intensity non-uniformity was corrected to account for variations in scanner characteristics. Then, the Wavelet, Laplacian of Gaussian, Square, SquareRoot, Logarithm, and Exponential filters were applied to the original images to generate filtered images.
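The intensity windowing described above amounts to clamping each voxel value to a fixed interval. The sketch below assumes a flat list of HU values and a hypothetical helper named `clip_hu`; resampling, bias correction, and filtering are not shown.

```python
def clip_hu(hu_values, lower=-1000.0, upper=1000.0):
    """Clamp each Hounsfield unit value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in hu_values]

# Hypothetical voxel values, including out-of-range entries (e.g. metal artifacts).
voxels = [-1024.0, -850.0, 35.0, 1000.0, 3071.0]
clipped = clip_hu(voxels)
print(clipped)  # [-1000.0, -850.0, 35.0, 1000.0, 1000.0]
```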

Radiomic features compliant with the Image Biomarker Standardisation Initiative (IBSI) were extracted from these VOIs using the PyRadiomics package (version 3.0.1) in Python. From the original and filtered images, a total of 1454 radiomic features were extracted from each VOI, including 288 first-order features, 14 shape features, and 1152 texture features. Texture features included Gray Level Co-occurrence Matrix (GLCM), Gray Level Size Zone Matrix (GLSZM), Gray Level Run Length Matrix (GLRLM), Neighboring Gray Tone Difference Matrix (NGTDM), and Gray Level Dependence Matrix (GLDM) features. The details of these features are presented in Supplementary Table S6.

Feature selection and model construction

A three-step method was used to select radiomic features. First, Student’s t-test was used to select features that differed significantly between the EGFR+ and EGFR− groups (p < 0.05). Next, the features with p < 0.05 were further selected by the least absolute shrinkage and selection operator (LASSO): tenfold cross-validation was applied to determine the optimal value of the tuning parameter λ, and features with nonzero coefficients were retained. After these irrelevant or redundant features had been removed, we used the minimum redundancy maximum relevance (mRMR) algorithm to rank the remaining features by a heuristic scoring criterion and retained only the top-ranked features.
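The third step can be illustrated with a greedy mRMR sketch. The scoring criterion used in the study is not specified, so this example makes explicit assumptions: relevance is the absolute Pearson correlation between a feature and the label, redundancy is the mean absolute correlation with the features already selected, and the score is their difference. The t-test and LASSO steps are omitted, and all names and data are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def mrmr(features, labels, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy."""
    relevance = {name: abs(pearson(vals, labels)) for name, vals in features.items()}
    selected, remaining = [], set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(abs(pearson(features[name], features[s]))
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        best = max(sorted(remaining), key=score)  # sorted for deterministic tie-breaking
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: f2 nearly duplicates f1, whereas f3 is weaker but less redundant.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "f1": [1, 2, 3, 7, 8, 9],
    "f2": [1, 2, 3, 7, 8, 10],
    "f3": [5, 1, 4, 6, 3, 8],
}
top_two = mrmr(features, labels, k=2)
print(top_two)  # ['f1', 'f3']
```

In the toy data, f2 is individually more relevant than f3 but is passed over because it adds almost nothing beyond f1, which is the behavior mRMR is designed to produce.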

The selected features were used to construct three kinds of radiomics models: (1) VOI_I model, a model with intratumoral radiomics alone; (2) VOI_P model, a model with peritumoral radiomics alone; (3) combined model, a model combining intratumoral and peritumoral radiomics. Multiple machine learning classifier algorithms, including Random Forest (RF), K-nearest neighbors (KNN), Logistic Regression (LR), Extremely Randomized Trees (ExtraTrees), CatBoost, eXtreme Gradient Boosting (XGBoost), NeuralNetFastAI, NeuralNetTorch, and Light Gradient Boosting Machine (LightGBM), were compared to determine the optimal classifier algorithm. Descriptions of these classifier algorithms and the optimal classifier algorithm for each VOI are shown in Supplementary Material 2 and Supplementary Table S1. For each VOI, the corresponding optimal classifier algorithm was used to construct the radiomics model. The predictive performance of each model was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1 score.
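The evaluation metrics listed above can be computed as in the following sketch. It assumes binary labels (1 for EGFR+, 0 for EGFR−), predicted scores in [0, 1], and a decision threshold of 0.5; the threshold and the function names are illustrative assumptions. The AUC is computed in its rank-based (Mann-Whitney) form, with ties counted as one half.

```python
def auc_score(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative case."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def threshold_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, and F1 score at a fixed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(y_true)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }

# Hypothetical labels and model scores for six patients.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
auc = auc_score(y_true, y_score)
metrics = threshold_metrics(y_true, y_score)
print(round(auc, 3), metrics)
```

The AUC is independent of the threshold, whereas the other four metrics change with it, which is one reason the AUC is used here as the primary criterion for comparing models.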

Statistical analysis and model evaluation

Continuous variables are expressed as mean ± standard deviation, and categorical variables as frequency (percentage). ANOVA and the chi-square test (or Fisher’s exact test) were used to assess statistical differences in continuous and categorical variables, respectively, across the three datasets. Statistical analyses were performed using SPSS 27.0 software (IBM Corp, Armonk, USA). The predictive performance of the models was evaluated using AUC, accuracy, sensitivity, specificity, and F1 score. The DeLong test was used to assess the differences in AUC between models. A p value < 0.05 was considered statistically significant.
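The DeLong comparison of two correlated AUCs, that is, two models scored on the same patients, can be sketched as follows. This is a minimal, unoptimized sketch of the standard structural-components formulation with a two-sided normal p value. It is not the software used in the study, and the function names and example scores are hypothetical.

```python
import math
from statistics import NormalDist

def _psi(x, y):
    """Pairwise kernel: 1 if the positive outscores the negative, 0.5 on ties."""
    return 1.0 if x > y else 0.5 if x == y else 0.0

def _components(pos, neg):
    """Structural components V10 (per positive case) and V01 (per negative case)."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(y_true, scores_a, scores_b):
    """Return (auc_a, auc_b, z, p) for two models evaluated on the same cases."""
    pos_idx = [i for i, t in enumerate(y_true) if t == 1]
    neg_idx = [i for i, t in enumerate(y_true) if t == 0]
    m, n = len(pos_idx), len(neg_idx)
    v10_a, v01_a = _components([scores_a[i] for i in pos_idx],
                               [scores_a[i] for i in neg_idx])
    v10_b, v01_b = _components([scores_b[i] for i in pos_idx],
                               [scores_b[i] for i in neg_idx])
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2 * cov_ab
    if var_diff <= 0:
        raise ValueError("degenerate case: variance of the AUC difference is zero")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return auc_a, auc_b, z, p

# Hypothetical scores from two models on the same eight patients.
y = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
model_b = [0.6, 0.9, 0.3, 0.4, 0.5, 0.7, 0.2, 0.1]
auc_a, auc_b, z, p = delong_test(y, model_a, model_b)
print(auc_a, auc_b, round(z, 3), round(p, 3))
```

Because both models are scored on the same cases, the covariance term reduces the variance of the AUC difference relative to a comparison that treats the two AUCs as independent.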

Results

Clinical characteristics of patients

The clinical characteristics of patients are shown in Table 1. A total of 779 patients (345 males, 434 females) were included in this study. Of these, 399 patients (51.2%) were classified as EGFR mutant (EGFR+) and 380 (48.8%) as wild-type (EGFR−). There were significant differences in smoking status, tumor subtype, and EGFR mutation status among patients in the three datasets.

Table 1 Clinical characteristics of patients

Feature selection

After the t-test, LASSO (Figs. 3, 4), and mRMR (Supplementary Fig. S1) were performed, a total of 262 highly predictive radiomic features were selected from the 15 VOIs, including 10 first-order features, 17 shape features, and 235 texture features. The details of the finally selected features and their importance for each VOI are presented in Table 2 and Supplementary Fig. S2, and the features selected for VOI4 and their importance are presented in Fig. 5. Notably, the selected feature group of each of the 15 VOIs included texture features and one shape feature (shape_Flatness), whereas only VOI_P1, VOI_P2, VOI_I, VOI1, VOI10, and VOI15 included first-order features.

Fig. 3

Feature selection based on the least absolute shrinkage and selection operator (LASSO) method. Identification of the optimal parameter λ in the LASSO model using tenfold cross-validation, drawing vertical lines at the optimal values via minimum criteria

Fig. 4

LASSO coefficient distributions of radiomic features for each VOI. Drawing vertical lines at the values selected using tenfold cross-validation, and features with nonzero coefficients in the LASSO regression model were the most predictive features

Table 2 The details of finally selected features for each VOI
Fig. 5

Radiomic features selected for VOI4 and their importance

Predictive performance of VOI_I model and combined models

In the training and validation sets, the VOI_I model achieved an AUC of 0.728. The AUCs of the VOI2, VOI3, VOI4, VOI5, VOI10, and VOI15 models (0.763, 0.770, 0.877, 0.734, 0.761, and 0.790, respectively) were higher than that of the VOI_I model, with the VOI4 model having the highest AUC, accuracy, sensitivity, and F1 score (Table 3, Fig. 6a).

Table 3 Predictive performance of VOI_I model and combined models
Fig. 6

ROC curves of VOI_I model and combined models in the training and validation sets (a), internal (b) and external (c) testing sets, and the difference in AUC between VOI_I model and VOI4 model in the internal and external testing sets, respectively. Blue represents VOI_I model and green represents VOI4 model. VOI—volume of interest; AUC—area under the curve; ROC—Receiver operating characteristic; CI—confidence interval

In the internal testing set, the AUCs of the VOI3, VOI4, and VOI15 models (0.700, 0.727, and 0.707, respectively) were higher than that of the VOI_I model (AUC = 0.698), with the VOI4 model having the highest AUC, accuracy, sensitivity, and F1 score (Table 3, Fig. 6b).

In the external testing set, the AUCs of the VOI2 and VOI4 models (0.673 and 0.701, respectively) were higher than that of the VOI_I model (AUC = 0.653), with the VOI4 model having the highest AUC, accuracy, and specificity (Table 3, Fig. 6c).

In addition, we used the DeLong test to evaluate the difference in AUC between models in the internal and external testing sets separately (Supplementary Table S2, Supplementary Table S3). The AUC of the VOI4 model was significantly different from that of the VOI_I model in the external testing set (p = 0.0006) (Fig. 6d).

Predictive performance of VOI_P models

Compared to other VOI_P models, the model based on the peritumoral 15 mm (VOI_P15) features alone achieved the best performance in the training and validation sets, internal testing set, and external testing set, with AUCs of 0.861, 0.716, and 0.704, respectively (Table 4, Fig. 7). The results of DeLong test are presented in Supplementary Table S4 and Supplementary Table S5.

Table 4 Predictive performance of VOI_P models
Fig. 7

ROC curves of VOI_P models in the training and validation sets (a), internal (b) and external (c) testing sets. ROC—Receiver operating characteristic; VOI—volume of interest; AUC—area under the curve; CI—confidence interval

Discussion

In this study, we constructed three kinds of radiomics models: (1) intratumoral model (VOI_I model); (2) peritumoral model (VOI_P model); (3) intratumoral and peritumoral model (combined model). We found that combined models showed great promise in predicting the EGFR mutation status of lung adenocarcinoma patients. The best prediction performance was obtained by VOI4 model, with the highest AUCs of 0.877, 0.727, and 0.701 in the training and validation sets, the internal testing set, and the external testing set, respectively.

To our knowledge, few studies have revealed the added value of peritumoral radiomics in predicting EGFR mutation status in lung cancer. Choe et al. demonstrated that the predictive model combining intratumoral and peritumoral radiomic features performed slightly better in the training set than the intratumoral model, but the difference was not statistically significant (AUC = 0.66 vs. 0.64, p = 0.504), whereas, in the validation set, the AUC was lower than that of the intratumoral model (AUC = 0.56 vs. 0.62) [21]. Another study showed that compared to intratumoral radiomics alone, combining intratumoral and peritumoral 3 mm radiomic features significantly improved the predictive performance of EGFR mutation status in primary lung cancer (AUC = 0.730 vs. 0.774, p < 0.001), and in lung adenocarcinoma only (AUC = 0.687 vs. 0.630, p < 0.001) [22]. However, that study did not determine whether the 3 mm peritumoral region was optimal for evaluating peritumoral features. Ideally, to determine the best peritumoral range, features should be extracted from different peritumoral ranges to construct separate models whose predictive performance can then be compared. A recent study compared radiomic features of multiple peritumoral regions (3 mm, 5 mm, 7 mm) and constructed three machine learning models to predict EGFR mutation status in NSCLC. The results showed that combining intratumoral and peritumoral 3 mm radiomic features could better distinguish EGFR+ from EGFR− groups than 5 mm and 7 mm (training, p < 0.0001; test, p = 0.0025), but that study included only 164 patients and did not validate the models with an external dataset [23].

Based on this, we expanded VOI_I outwards by 1 mm, 2 mm, 3 mm, 4 mm, 5 mm, 10 mm, and 15 mm to define seven peritumoral regions and combined each of them with the intratumoral region to generate seven combined intratumoral and peritumoral regions, in order to compare the complementary value of different peritumoral regions to the predictive performance of radiomic models. In addition, compared to the previous studies, our study used a larger training cohort, and the models were tested in an independent internal testing set and an external testing set. As a result, our model may be more effective in illustrating the differences in radiomic features between EGFR+ and EGFR− groups.

According to the results, the peritumoral region of lung adenocarcinoma may also provide important predictive information about EGFR mutations, with the best predictive performance achieved by combining intratumoral and peritumoral 4 mm radiomic features. Tumor cells are usually highly invasive and tend to migrate from the primary tumor to the surrounding parenchyma, disrupting the normal structure and causing morphological and textural changes in the peritumoral region. These changes are difficult to detect visually on medical images, whereas radiomic features extracted from CT images can quantitatively reflect subtle changes in the microenvironment surrounding the tumor that cannot be recognized by the naked eye. This may be the pathophysiological basis for the improved predictive performance of the combined models over the VOI_I model. Lung adenocarcinomas have marked cellular and mutational heterogeneity. The concept of tumor heterogeneity applies not only to tumor epithelial cells but also to the various microenvironments with which the tumor cells interact, such as vasculature, cancer-associated fibroblasts, extracellular matrix, and infiltrating immune cells. Tumor cells can influence their microenvironment by releasing cell signaling molecules that promote tumor angiogenesis and induce immunological tolerance. Meanwhile, immunocytes infiltrating the tumor microenvironment can secrete a large number of cytokines and chemokines that promote the epithelial-mesenchymal transition of tumor cells, which allows tumor cells to invade and metastasize [24].

The tumor margin is an important meeting place in the tumor microenvironment where immune and stromal cells are highly active and interact with the tumor. The microenvironment at tumor invasion edges differs from that of the tumor core. Hypoxia tends to be associated with the center of the tumor, whereas oxygen is primarily present at the tumor periphery. Monocytes in the blood are recruited around tumor cells by various chemokines and cytokines, thus becoming tumor-associated macrophages, which can promote the invasion of tumor cells by supplying pro-migratory factors such as epidermal growth factor, or by promoting extracellular matrix proteolytic remodeling, and play an important role in the invasion process of the tumor margin. Furthermore, under hypoxic conditions, tumor-associated macrophages promote tumor cell release of vascular endothelial growth factor and platelet-derived growth factor via the activation of the hypoxia-inducible factor-1 pathway, thus promoting tumor angiogenesis, providing oxygen and nutrients for tumor growth, and contributing to tumor cell invasion and metastasis. In addition, tumor-associated fibroblasts are also abundant at the tumor margin, promoting tumor proliferation, angiogenesis, invasion, and metastasis by secreting various growth factors, cytokines, and inflammatory chemokines [25, 26].

As in several previous studies, the most predictive radiomic features finally selected in our study included a large number of texture features (235 in total), which reflect the pattern and spatial distribution of voxel intensities within the VOI and thus indicate its biological heterogeneity [15]. Therefore, our results may suggest that tumor heterogeneity is associated with EGFR mutation status in lung adenocarcinoma. Regarding the shape features, the shape_Flatness feature, which describes the relationship between the largest and smallest principal components of the VOI shape, was found among the finally selected features of all 15 VOIs, suggesting that this feature plays an important role in predicting EGFR mutation status. However, unlike most other studies [16, 22, 27, 28], there were no first-order features in our best predictive model (VOI4). First-order features describe the distribution of voxel intensities within the target region through commonly used, basic metrics, but because they do not consider the neighborhood relationships between voxels, they cannot capture the spatial distribution of voxel intensities [29]. In our best predictive model, they were not critical predictive features.

In addition, we found that features from the peritumoral regions alone also had predictive value for EGFR mutations. Compared to other peritumoral radiomics models, the model based on the peritumoral 15 mm (VOI_P15) features achieved the best performance in the training and validation sets, the internal testing set, and the external testing set, with AUCs of 0.861, 0.716, and 0.704, respectively. However, this was inconsistent with previous findings that, as peritumoral distance increases, the VOI comprises more normal lung tissue and relatively less tumor tissue, which decreases the predictive performance of the model [30]. A probable explanation is that radiomic features become more stable as peritumoral distance increases [31]. Tunali et al. also demonstrated that some radiomic features, including statistical features, histograms, and some texture features (GLCM, GLRLM, GLSZM, and NGTDM), had good stability and reproducibility regardless of peritumoral distance, indicating that these features were less influenced by changes in the size or shape of peritumoral regions caused by different segmentation and image acquisition [31]. This is generally consistent with the features eventually selected in our study, and such stable and reproducible features are more likely to yield robust radiomics models, allowing multicenter studies to maximize the clinical utility of radiomics models [32].

To achieve more generalizable and impactful results in radiomics, researchers need to obtain large patient cohorts by combining images from multiple institutions. However, most current radiomics studies collect imaging data retrospectively, and image acquisition protocols, processing or reconstruction settings, and imaging scanners may differ between institutions, resulting in poor reproducibility and repeatability of radiomic features [33,34,35]. Therefore, to discover more reliable and stable radiomic features and apply them in multicenter clinical practice, image consistency must be improved by controlling imaging protocols, so that a public database with a large amount of high-quality data can be built [36]. In addition, several studies have demonstrated that the use of harmonization methods in the image domain (prior to feature extraction) or the feature domain (within or after feature extraction) would be beneficial in the design of multicenter studies. According to recent studies, ComBat harmonization is a fast and easy-to-use harmonization method in the feature domain that corrects radiomic features to reduce the variation caused by different imaging protocols [37,38,39]. It was first proposed by Johnson et al. [40] for genetic studies, was later used by Fortin et al. for medical imaging applications [41] and by Orlhac et al. [42] for PET radiomics studies, and has produced good results in several subsequent studies [39, 43, 44]. Among them, Shiri et al. demonstrated that ComBat harmonization could significantly improve prediction performance when radiomics was used to predict EGFR mutation status in NSCLC: the range of mean AUC increased from 0.87–0.90 to 0.92–0.94, supporting the effectiveness of ComBat harmonization [43]. Therefore, we may apply ComBat harmonization to further improve the prediction performance of the model in the future.
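The idea behind feature-domain harmonization can be conveyed with a deliberately simplified location-scale adjustment. The sketch below is not ComBat: it omits the empirical Bayes shrinkage and the covariate terms that ComBat uses to preserve biological signal, and it handles a single feature at a time. The function name and the toy values are hypothetical.

```python
from statistics import mean, pstdev

def harmonize_location_scale(values, sites):
    """Align each site's mean and SD of one feature to the pooled mean and SD."""
    pooled_mean, pooled_sd = mean(values), pstdev(values)
    by_site = {}
    for value, site in zip(values, sites):
        by_site.setdefault(site, []).append(value)
    site_stats = {site: (mean(vals), pstdev(vals)) for site, vals in by_site.items()}
    harmonized = []
    for value, site in zip(values, sites):
        site_mean, site_sd = site_stats[site]
        # Standardize within the site, then map back onto the pooled scale.
        z = (value - site_mean) / site_sd if site_sd > 0 else 0.0
        harmonized.append(pooled_mean + pooled_sd * z)
    return harmonized

# One radiomic feature measured at two hypothetical institutions with a scanner offset.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
sites = ["A", "A", "A", "B", "B", "B"]
adjusted = harmonize_location_scale(values, sites)
print([round(v, 3) for v in adjusted])
```

A caveat of this naive version is that it also removes genuine between-site differences, for example those arising from different EGFR mutation rates across cohorts. ComBat mitigates this by modeling such covariates explicitly, so the sketch should be read as motivation rather than as a substitute.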

Despite the encouraging results, our study has some limitations. First, although we included lung adenocarcinoma patients from other sources as an external testing set to validate the reliability and stability of the model, the sample size of this set was small, so its ability to demonstrate predictive efficiency may be limited, and multi-institutional image data are needed to assess the generalizability of our findings in the future. Second, the incidence of EGFR mutation varies greatly across races, with a significantly higher incidence in Asian populations [45]. The patients used for model training in our study were all Asian, which limits the generalizability of the results and calls for further validation in patients of other races. Third, some other potentially valuable factors, such as smoking status and gender, were not included in this study; we will combine radiomic features with these clinical features in further research to improve the predictive performance of the model.

In conclusion, radiomic features extracted from the peritumoral region add value in predicting the EGFR mutation status of lung adenocarcinoma patients, with an optimal peritumoral range of 4 mm. This finding may partially support the clinical value of the peritumoral microenvironment in cancer diagnosis.