Introduction

Pulmonary nodules are prevalent during CT screening, with at least one nodule detected in up to 51% of initial screenings [1]. Although more than 95% of these nodules are ultimately determined to be benign, a significant number of malignant pulmonary nodules are still detected because of the vast number of cases screened [2, 3]. Lung adenocarcinoma (LAC) is the dominant histological subtype of malignant pulmonary nodules [4, 5]. According to the pathological classification of the World Health Organization, LAC is categorized into three types based on the level of invasiveness: adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA), and invasive adenocarcinoma (IAC) [6]. In 2020, the International Association for the Study of Lung Cancer (IASLC) reclassified IAC into three grades based on the level of differentiation, namely well-differentiated IAC (WIAC), moderately differentiated IAC (MIAC), and poorly differentiated IAC (PIAC), a scheme that proved superior to models incorporating nuclear or cytologic grade [7]. Each LAC subtype exhibits distinct biological characteristics and prognosis. As a preinvasive lesion, AIS is typically managed through follow-up surveillance, but some nodules may progress to MIA or IAC [8]. AIS and MIA have an excellent prognosis after sub-lobar resection [9], while poorly and moderately differentiated IAC exhibit higher postoperative recurrence rates than well-differentiated IAC [7, 10]. Therefore, accurately identifying the invasiveness and differentiation of adenocarcinoma from preoperative CT images may provide guidance for surveillance, surgical strategy, and prognosis.

Traditionally, the classification of LAC subtypes relies on visual assessment and verbal description of radiological features. Previous investigations have highlighted the significance of nodule characteristics, including size, type, margin, pleural indentation, vacuole sign, and vascular convergence sign, in determining the pathological nature of pulmonary nodules [11,12,13,14]. However, accurate classification heavily relies on the expertise and diagnostic proficiency of radiologists. Li et al. found that senior radiologists exhibited superior predictive capabilities in discerning the grading of IAC compared to their junior counterparts [15]. Presently, histological classification necessitates invasive tissue sampling through surgery or needle biopsy, which can be burdensome. Therefore, the development of a non-invasive and convenient approach to anticipate the histological subtypes of pulmonary nodules holds significant clinical implications.

Radiomics, a highly promising methodology, involves the extraction of numerous high-dimensional, retrievable features from medical imaging data, which can be used either independently or in conjunction with clinical features [16,17,18]. This approach has shown utility in distinguishing between benign and malignant lung nodules, predicting the invasiveness of lung adenocarcinoma, and identifying the preoperative IASLC grade of IAC [19,20,21,22]. Several studies have developed radiomics models to categorize and predict the pathological attributes of specific nodules, such as pure ground-glass nodules (pGGNs), mixed ground-glass nodules (mGGNs), and solid nodules (SNs), demonstrating robust predictive capabilities [23,24,25]. However, these studies have primarily focused on two- or three-classification radiomics models and therefore do not cover the majority of nodule types or the new pathological grading. Therefore, the development of a multi-classification radiomics approach that can predict the pathological invasiveness and differentiation of pulmonary nodules holds greater clinical value and practicality [26].

With this foundation in mind, the objective of this investigation is to construct a five-classification radiomics model integrating clinic-radiological features for the prediction of invasiveness and differentiation of adenocarcinoma nodules, encompassing AIS, MIA, WIAC, MIAC, and PIAC. The ultimate aim is to establish a non-invasive approach that enables comprehensive assessment of the histological classifications of pulmonary nodules.

Materials and methods

Patients

A total of 951 patients who underwent complete resection for suspected lung cancer were included in this study, with clinical data and preoperative CT images collected between January 2020 and December 2022. To enhance the homogeneity of the patient cohorts, specific exclusion criteria were applied, which encompassed: (1) patients with confirmed non-adenocarcinoma histology, such as squamous carcinoma, mucinous adenocarcinoma, metastases, and others (n = 187); (2) lung nodules larger than 3 cm in diameter (n = 192); (3) patients who received clinical treatment and needle biopsy prior to the CT examination (n = 76); (4) CT images of inadequate quality (n = 27); and (5) patients with lymph-node metastases (n = 45). Ultimately, a total of 500 patients (201 men and 299 women) were retrospectively enrolled in this study, with a median age of 59 years (age range: 19–83 years). Among these patients, there were 97 cases of AIS, 139 cases of MIA, 99 cases of WIAC, 84 cases of MIAC, and 81 cases of PIAC. The detailed process of patient recruitment is presented in Fig. 1.

Fig. 1 Flowchart of the patient selection

Histopathological evaluation

All surgically resected lung specimens were examined following the 2021 WHO classification of thoracic tumors and the grading system newly proposed by the IASLC [6, 7]. The diagnosis of LAC was made based on comprehensive histologic patterns, which encompassed lepidic, acinar, papillary, micropapillary, solid, cribriform, and complex glandular patterns [27]. The proportion of each histologic pattern was recorded in 5% increments to determine the predominant histologic subtype and quantify any patterns for tumor grading. Adenocarcinoma was categorized into two groups based on the degree of invasion: non-IAC (including AIS and MIA) and IAC (including WIAC, MIAC, and PIAC). Within the IAC group, WIAC and MIAC were regarded as low-grade subtypes and PIAC as the high-grade subtype. This grading scheme is shown in Table 1 [7]. According to the pathological invasiveness and differentiation of the pulmonary nodules, the two-, three-, and five-classification tasks were defined as follows: two-classification was (AIS, MIA) vs (WIAC, MIAC, and PIAC), three-classification was (AIS, MIA) vs (WIAC, MIAC) vs (PIAC), and five-classification was (AIS) vs (MIA) vs (WIAC) vs (MIAC) vs (PIAC).
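The three task definitions above amount to a mapping from histological subtype to class label. The following is a minimal illustrative sketch in Python, not the study's code; the names `TASKS`, `task_label`, and `class_counts` are hypothetical, and the integer label order is an assumption.

```python
# Hypothetical label mapping for the two-, three-, and five-classification tasks.
SUBTYPES = ["AIS", "MIA", "WIAC", "MIAC", "PIAC"]

TASKS = {
    # (AIS, MIA) vs (WIAC, MIAC, PIAC)
    2: {"AIS": 0, "MIA": 0, "WIAC": 1, "MIAC": 1, "PIAC": 1},
    # (AIS, MIA) vs (WIAC, MIAC) vs (PIAC)
    3: {"AIS": 0, "MIA": 0, "WIAC": 1, "MIAC": 1, "PIAC": 2},
    # every subtype is its own class
    5: {name: index for index, name in enumerate(SUBTYPES)},
}


def task_label(subtype: str, n_classes: int) -> int:
    """Return the integer class label of a subtype for the given task."""
    return TASKS[n_classes][subtype]


def class_counts(subtype_counts: dict, n_classes: int) -> list:
    """Aggregate per-subtype case counts into per-class counts for a task."""
    counts = [0] * n_classes
    for subtype, n_cases in subtype_counts.items():
        counts[task_label(subtype, n_classes)] += n_cases
    return counts
```

Applied to the enrolled cohort (97 AIS, 139 MIA, 99 WIAC, 84 MIAC, 81 PIAC), `class_counts` gives 236 non-IAC and 264 IAC cases for the two-classification task, and 236, 183, and 81 cases for the three-classification task.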

Table 1 Grading scheme for invasive adenocarcinomas of pulmonary nodules

CT acquisition

The patients underwent non-contrast chest CT on one of three CT systems: Somatom Definition AS 64 (Siemens Healthcare, Germany), Somatom Definition Flash (Siemens Healthcare, Germany), or GE Discovery CT750 HD (GE Medical Systems, USA). Patients were scanned in the supine position at full inspiration during a breath-hold, from the apex of the lung to the diaphragm. Scanning used a tube voltage of 120 kV, automatic tube current, and a reconstruction slice thickness and interval of 1 or 1.25 mm. Images were reconstructed with a matrix size of 512 × 512 and displayed with a lung window (level, -500 HU; width, 1500 HU).

Clinical and radiological features

The clinical data, including gender, age, smoking history, hypertension, diabetes, and neoplasia history, were retrieved from the hospital information system. The radiological characteristics were assessed by two thoracic specialists, and a consensus was reached. These characteristics encompassed the involved lobe (right upper lobe, right middle lobe, right lower lobe, left upper lobe, and left lower lobe), nodule shape (regular or irregular), nodule type (pGGN, mGGN, or SN), boundary (clear or blurred border definition), lobulation (indentation at the edge of a round or oval lesion), spiculation (linear strands extending into the lung parenchyma but not touching the pleural surface), vacuole (small focal areas of low attenuation within the nodule), air bronchogram (tubelike or branched air structure within the nodule), vascular convergence (multiple supplying vessels converging toward the lesion), pleural retraction (linear strands extending toward the pleura or major/minor fissure from the mass, causing pleural distortion), bronchial cut-off (sudden truncation of a bronchus due to obstruction within the nodule), and presence of abnormal vessels within the nodules (distorted, dilated, and complicated vessels within the lesions) [14, 28,29,30].

Pulmonary nodules’ segmentation

A single radiologist (with 10 years of experience in chest imaging), who was blinded to the pathological results, conducted semi-automated lesion segmentation on CT images with lung window settings using the Radcloud Platform [31] (version 7.5; Huiying Medical Technology Co., Ltd., Beijing). The segmentation process involved delineating the pulmonary nodules on a section-by-section basis to generate a three-dimensional region of interest (ROI). After a month, 50 cases were randomly selected, and the same radiologist repeated the segmentation to assess intra- and interobserver reproducibility. The aforementioned segmentation results were further validated by an experienced radiologist (with 15 years of experience in chest imaging).

Radiomics and clinic-radiological features extraction and selection

The extraction of image features plays a fundamental role in radiomics analysis, as it enables the identification of features that capture the biological characteristics of lesions and tumor heterogeneity. In this study, the Radcloud platform was used to extract radiomics features. Specifically, the platform employed PyRadiomics [32] (version 3.1.0, https://pyradiomics.readthedocs.io/), a Python-based library, to extract a comprehensive set of radiomics features from the medical images. To enhance the reproducibility of the radiomics analyses, the images were pre-processed before feature extraction: z-score normalization of the CT images was conducted using PyRadiomics, grayscale discretization used a fixed bin width of 25 HU, and voxels were resampled to 1 × 1 × 1 mm³ using PyRadiomics [33].
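To make the two intensity pre-processing steps concrete, the sketch below shows z-score normalization and fixed-bin-width discretization on a flat list of voxel intensities. It is a simplified pure-Python illustration and not the PyRadiomics implementation: the function names are hypothetical, the use of the population standard deviation is an assumption, and the binning rule (bin index = floor(x / width) - floor(min / width) + 1) is the usual fixed-bin-width convention.

```python
import math
from statistics import mean, pstdev


def z_normalize(intensities):
    """Centre intensities on their mean and scale by their standard deviation.

    Assumes the intensities are not all identical (non-zero deviation).
    """
    mu = mean(intensities)
    sigma = pstdev(intensities)  # population SD; an assumption of this sketch
    return [(x - mu) / sigma for x in intensities]


def discretize_fixed_bin_width(intensities, bin_width=25.0):
    """Map intensities to 1-based bin indices using a fixed bin width."""
    offset = math.floor(min(intensities) / bin_width)
    return [math.floor(x / bin_width) - offset + 1 for x in intensities]
```

With a bin width of 25, the intensities -30, 0, 24, 25, and 60 fall into bins 1, 3, 3, 4, and 5. A fixed bin width keeps the intensity range covered by each bin constant across patients, whereas a fixed bin count would not.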

In our study, we extracted a total of 1688 image features belonging to seven categories from the ROI of each patient. These categories were first-order statistics, 3D shape features, gray-level co-occurrence matrix (GLCM) features, gray-level run length matrix (GLRLM) features, gray-level size zone matrix (GLSZM) features, neighboring gray tone difference matrix (NGTDM) features, and gray-level dependence matrix (GLDM) features. Notably, shape features were derived solely from the original images, while the remaining features were also obtained after applying various filters: wavelet, square, square root, gradient, logarithm, exponential, local binary pattern in 2D (LBP-2D), and local binary pattern in 3D (LBP-3D). For the extraction of texture features, preprocessed CT images underwent wavelet filtering, which transformed the volume of interest (VOI) into the wavelet domain while preserving the low-pass (LLL) and high-pass (HHH) subbands and assigning different weights to the other subbands (LHL, LHH, LLH, HLL, HHL, and HLH). Additionally, the LBP-3D image type comprised three subcategories: the kurtosis map (LBP-3D-k) and two maps calculated using different levels of spherical harmonics (LBP-3D-m1 and LBP-3D-m2). All the aforementioned radiomics features adhere to the Image Biomarker Standardization Initiative (IBSI, https://theibsi.github.io).

Prior to selecting radiomics features, Z-score normalization was applied to all features. Each patient had a total of 1688 features, which implies considerable redundancy. To avoid diminishing the predictive performance of the model and to reduce computational time, feature selection was performed in the training set before model development. First, the reproducibility of the radiomics features was evaluated using the intraclass correlation coefficient (ICC). ICC values below 0.5, between 0.5 and 0.75, between 0.75 and 0.9, and above 0.90 indicate poor, moderate, good, and excellent reliability, respectively [34]. Consequently, features with ICC values surpassing 0.75 were retained for the subsequent stages of feature selection. Next, a variance threshold of 0.8 was applied to remove low-variance features. The univariate analysis method SelectKBest was then used to identify features with a p value less than 0.05. Finally, least absolute shrinkage and selection operator (LASSO) regression with tenfold cross-validation was employed to identify relevant and informative features associated with lung cancer classification.
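The first two filtering steps described above can be sketched as follows. This is an illustrative pure-Python outline under stated assumptions, not the pipeline used in the study: the function names and feature names are hypothetical, the ICC values are assumed to have been computed beforehand, and the handling of values lying exactly on a boundary (0.5, 0.75, 0.9) is an assumption.

```python
from statistics import pvariance


def icc_reliability(icc: float) -> str:
    """Classify an ICC value with the reliability bands quoted in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


def retain_reproducible(icc_by_feature: dict, threshold: float = 0.75) -> list:
    """Keep the names of features whose ICC exceeds the threshold."""
    return [name for name, icc in icc_by_feature.items() if icc > threshold]


def variance_filter(columns: dict, threshold: float = 0.8) -> list:
    """Keep the names of features whose variance across patients exceeds the threshold."""
    return [name for name, values in columns.items() if pvariance(values) > threshold]
```

For example, hypothetical features with ICC values of 0.62, 0.81, and 0.95 would be rated moderate, good, and excellent, and only the latter two would pass the ICC filter.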

LASSO regression is also applicable to the selection of clinic-radiological features [35], integrating radiological scores with independent clinical risk factor scores to establish a predictive model for lung cancer classification. Specifically, LASSO shrinks the regression coefficients toward zero according to a regularization parameter λ and sets the coefficients of many irrelevant features exactly to zero. To determine the optimal value of λ, we employed tenfold cross-validation with the minimum criterion, selecting the λ value that yielded the lowest cross-validation error. The features retained with non-zero coefficients were used to fit the regression model and were combined into a radiomics and clinic-radiological features model. The predicted value of the model for each patient was computed as a linear combination of the selected features weighted by their regression coefficients.
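Two ideas in this paragraph can be illustrated briefly: how an L1 penalty produces exact zeros, and how λ is chosen by the minimum criterion. The sketch below is a simplified illustration rather than the fitting procedure used in the study. The soft-thresholding operator shown is the building block of coordinate-descent LASSO solvers (and, for an orthonormal design, gives the LASSO estimate directly from the least-squares coefficient); the function names and the example error values are hypothetical.

```python
def soft_threshold(value: float, lam: float) -> float:
    """Shrink a coefficient toward zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def select_lambda_min(cv_errors: dict) -> float:
    """Return the candidate lambda with the lowest mean cross-validation error."""
    return min(cv_errors, key=cv_errors.get)
```

With λ = 0.5, a coefficient of 1.5 shrinks to 1.0, whereas a coefficient of 0.3 is set to zero and its feature is dropped. Given hypothetical mean cross-validation errors of 0.30, 0.24, and 0.27 for λ = 0.01, 0.05, and 0.1, the minimum criterion selects λ = 0.05.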

Given the involvement of two-classification, three-classification, and five-classification tasks in our study, it is essential to note that the LASSO labels employed during feature selection are tailored to the specific categorization requirements of each respective task. In other words, the labels used in the LASSO regularization process correspond uniquely to the distinct classification schemes associated with the two-classification, three-classification, and five-classification tasks. This tailored approach ensures the relevance and appropriateness of the selected features for each specific classification task within our study framework.

Development of machine learning (ML) models

The features extracted from the segmented lesions were used for data analysis and model construction with machine learning techniques, so that the models reflect lesion information and predict the lesion subtype. The radiomics dataset comprised a training set for model development and a testing set for model evaluation. Commonly used radiomics models include logistic regression (LR), support vector machine (SVM), K-nearest neighbors (KNN), decision tree (DT), random forest (RF), and gradient boosting decision tree (GBDT), among others. One-versus-rest (OVR) and one-versus-one (OVO) are two well-known strategies that decompose a multi-class classification problem into multiple binary classification problems. Since LR and SVM are inherently two-classification models, this study applied them as LR-OVR, SVM-OVR, LR-OVO, and SVM-OVO for the three-classification and five-classification tasks, and compared them with other multi-classification models, namely KNN, DT, RF, and GBDT.
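The difference between the two decomposition strategies can be seen by enumerating the binary problems each one creates. The following is a minimal sketch with hypothetical function names; the majority-vote rule for OVO is the common convention and is assumed here, with ties broken arbitrarily by first occurrence.

```python
from collections import Counter
from itertools import combinations


def ovr_problems(classes):
    """One binary problem per class: that class versus all remaining classes."""
    return [(c, tuple(o for o in classes if o != c)) for c in classes]


def ovo_problems(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def ovo_vote(pairwise_winners):
    """Predict the class that wins the most pairwise contests."""
    votes = Counter(pairwise_winners)
    return votes.most_common(1)[0][0]
```

For K classes, OVR trains K binary classifiers and OVO trains K(K - 1)/2. The three-classification task therefore needs three classifiers under either strategy, while the five-classification task needs five under OVR and ten under OVO.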

Validation of the optimizing ML models

In the evaluation of the testing cohort, the performance of the two-classification problem was assessed using quantitative measures, such as accuracy (ACC), sensitivity, specificity, and the area under the receiver-operating characteristic curve (AUC). Furthermore, considering the inherent characteristics of the multi-classification problem, the predictive capabilities of the models designed for multiple classes were examined by computing macro- and micro-averaged AUCs [36]. Additionally, macro-average accuracy, F1-score, recall, and precision were calculated to evaluate the classification performance of the multi-classification models. Notably, in addition to the radiomics model, the same methodology was employed to develop the clinic-radiological model and combined model, with the objective of verifying whether the inclusion of clinical variables enhances the classification performance of the machine learning models for pulmonary nodules. Our study flow diagram is shown in Fig. 2.
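The distinction between macro- and micro-averaged AUC can be sketched as follows. This is an illustrative pure-Python implementation under stated assumptions, not the evaluation code of the study: each class is scored one-versus-rest, the binary AUC is computed as the proportion of positive-negative pairs ranked correctly (ties count one half), and the function names are hypothetical.

```python
def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_micro_auc(y_true, probabilities, n_classes):
    """Return (macro AUC, micro AUC) from per-class predicted probabilities."""
    per_class, pooled_labels, pooled_scores = [], [], []
    for k in range(n_classes):
        labels_k = [1 if y == k else 0 for y in y_true]
        scores_k = [row[k] for row in probabilities]
        per_class.append(binary_auc(labels_k, scores_k))
        pooled_labels.extend(labels_k)
        pooled_scores.extend(scores_k)
    macro = sum(per_class) / n_classes  # unweighted mean of per-class AUCs
    micro = binary_auc(pooled_labels, pooled_scores)  # one AUC over pooled decisions
    return macro, micro
```

The macro average weights every class equally regardless of its size, whereas the micro average pools all one-versus-rest decisions and is therefore dominated by the larger classes.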

Fig. 2 Workflow of necessary steps in current study. LASSO, least absolute shrinkage and selection operator; ROC, receiver-operating characteristic

Statistical analyses

The statistical analyses were conducted using R software (version 4.2.1; https://www.r-project.org/) to compare the differences in clinical and radiological data among the five groups. The Chi-square test was used for categorical variables, and the Mann–Whitney U test was applied to quantitative variables. The overall performance of the multi-classification models in the development and testing cohorts was evaluated through receiver-operating characteristic (ROC) curve analysis and the calculation of micro- and macro-AUC. All statistical tests were two-sided, and P < 0.05 was considered statistically significant.

Results

Patient characteristics

Table 2 displays the clinical and radiological features observed in the cohort of 500 patients. The enrolled patients were randomly allocated to a training set (n = 349) and a testing set (n = 151) at a ratio of 7:3. Statistical analysis revealed significant differences among the five histological subtypes for various clinical and radiological features, including gender, smoking history, hypertension, age, nodule boundary, lobulation sign, bronchial cut-off sign, spiculation sign, vacuole sign, air bronchogram sign, vascular convergence sign, pleural retraction, abnormal vessels within nodules, and nodule type (p < 0.05).

Table 2 The clinical and radiological features of patients in the histological classifications

Consequently, the two-classification system comprised non-IAC (AIS, MIA, n = 236) and IAC (WIAC, MIAC, and PIAC, n = 264). The three-classification system comprised non-IAC (AIS, MIA, n = 236), low-grade IAC (WIAC, MIAC, n = 183), and high-grade IAC (PIAC, n = 81). In the five-classification system, the subtypes included AIS (n = 97), MIA (n = 139), WIAC (n = 99), MIAC (n = 84), and PIAC (n = 81).

The selected radiomics and clinic-radiological features in varying classification models

After extracting a total of 1688 radiomics features, only those demonstrating good feature consistency (ICC ≥ 0.75) were selected for further analysis. For the two-classification model, a combination of variance thresholding, SelectKBest, and LASSO regression methods was utilized to identify 31 non-zero coefficient radiomics features from the CT sequences. These selected features were then used to calculate the 2-Rad-score for each patient in both the training and testing cohorts. The calculation of the 2-Rad-score involved summing the products of the corresponding feature values and their respective weights. Similarly, for the 17 clinic-radiological features, LASSO regression was employed to extract 11 non-zero coefficient clinic-radiological features. This process resulted in the computation of the 2-clinic-radiological-score using the same methodology.

Following the same approach, the 3-Rad-score and 3-clinic-radiological-score for the three-classification model were calculated using 39 retained radiomics features and 14 retained clinic-radiological features, respectively. Additionally, for the five-classification model, 26 radiomics features and 11 clinic-radiological features were retained to calculate the 5-Rad-score and 5-clinic-radiological-score, respectively.
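The score calculation described in this section, a sum of feature values multiplied by their weights, can be written as a short function. The sketch below is illustrative only: the function name, the feature names, and the numeric values are hypothetical, and the optional intercept is an assumption, since the text does not state whether one was included.

```python
def linear_score(features: dict, coefficients: dict, intercept: float = 0.0) -> float:
    """Weighted sum of the selected feature values.

    `features` maps feature names to a patient's (normalized) values;
    `coefficients` holds only the features with non-zero LASSO coefficients,
    so features dropped by the selection do not contribute.
    """
    return intercept + sum(weight * features[name] for name, weight in coefficients.items())
```

For a hypothetical patient with feature values f1 = 2.0, f2 = -1.0, and f3 = 5.0 and retained weights of 0.5 for f1 and 0.25 for f2, the score is 0.5 × 2.0 + 0.25 × (-1.0) = 0.75; f3 is ignored because it was not retained.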

Performance of the ML models in different classifications

Clinic-radiological, radiomics, and combined models were constructed for the two-classification task using LR, SVM, KNN, DT, RF, and GBDT algorithms. The performance of these ML models in the testing cohort was evaluated, and the results are presented in Table 3. All ML models exhibited satisfactory performance in predicting non-IAC and IAC. The SVM model showed the best overall performance, achieving the highest AUC and ACC values. Notably, the combined model demonstrated improved performance compared to the clinic-radiological and radiomics models, achieving an AUC of 0.942 and an ACC of 0.894, whereas the clinic-radiological and radiomics models achieved AUC values of 0.905 and 0.938, and ACC values of 0.848 and 0.868, respectively.

Table 3 The performance of ML models in testing cohort of two-classification model

Similarly, for the three-classification task, clinic-radiological, radiomics, and combined models were constructed using LR-OVR, SVM-OVR, LR-OVO, SVM-OVO, KNN, DT, RF, and GBDT algorithms. The performance of these models in predicting non-IAC, low-grade IAC, and high-grade IAC was evaluated, and the results are presented in Table 4. The SVM-OVO model demonstrated the best overall performance among all ML models in the testing cohort, regardless of the clinic-radiological, radiomics, or combined model. However, the combined model achieved a higher ACC compared to the other two models, with a value of 0.767 versus 0.740 and 0.753, respectively.

Table 4 The performance of ML models in testing cohort of three-classification model

The results of the five-classification task, as presented in Table 5, were consistent with the three-classification results. The SVM-OVO model exhibited the best overall performance in predicting AIS, MIA, WIAC, MIAC, and PIAC. Once again, the combined model outperformed the other two models in terms of ACC, achieving a value of 0.607 compared to 0.513 and 0.553.

Table 5 The performance of ML models in testing cohort of five-classification model

Optimal ML model classification evaluation

The SVM-OVO ML model, combined with radiomics and clinic-radiological features, exhibited excellent performance in classifying pulmonary adenocarcinoma nodules, as presented in Table 6. To visualize the correlation between radiomics features, clinic-radiological scores, and histological types, a cluster graph was constructed using representative patient data, as shown in Fig. 3.

Table 6 The performance of multi-classification using SVM-OVO for predicting histological classification of pulmonary adenocarcinoma nodules
Fig. 3 The cluster graph provides a visual representation of the relative correlation between radiomics features, clinic-radiological scores, and histological types in a set of representative patients. CT images (a–e, upper) and corresponding histological images (a–e, lower; magnification × 200) of five representative patients are displayed. Patient 1, a 53-year-old woman with adenocarcinoma in situ (AIS), had a ground-glass nodule (GGN) with a focal solid component in the left upper lobe (a). Patient 2, a 74-year-old woman with minimally invasive adenocarcinoma (MIA), had a GGN with a blurred vessel in the right upper lobe (b). Patient 3, a 58-year-old woman with well-differentiated invasive adenocarcinoma (WIAC), had a mixed GGN with lobulation, short spiculation, and pleural retraction in the right lower lobe (c). Patient 4, a 53-year-old woman with moderately differentiated invasive adenocarcinoma (MIAC), had a solid nodule (SN) with lobulation, short spiculation, and pleural retraction in the right lower lobe (d). Patient 5, a 70-year-old man with poorly differentiated invasive adenocarcinoma (PIAC), had an SN with lobulation, more short spiculations, pleural retraction, and vascular convergence in the right upper lobe (e). A comparison of 36 radiomics features and the clinic-radiological score among the five patients is illustrated in panel (f).

Figure 4 displays the confusion matrices of the combined model using SVM and SVM-OVO in both the training and testing cohorts. The matrices illustrate that the selected models were not prone to making errors and effectively captured the relationships among histological subtypes. The ACC in the testing cohort exceeded 0.6, even for the challenging five-classification task. All histological subtypes in the three- and five-classification tasks were accurately identified. The macro-AUC and micro-AUC values of the three-classification model in the testing cohort were 0.884 and 0.896, respectively. Similarly, the macro-AUC and micro-AUC values of the five-classification model were 0.858 and 0.866, respectively. The AUC values of the histological subtypes ranged from 0.787 to 0.942, with the lowest AUC observed for MIAC in the testing cohort, as depicted in Fig. 5.
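Accuracy and the macro-averaged precision, recall, and F1-score can all be read off a confusion matrix. The sketch below shows one way to do so; it is an illustration with a hypothetical function name and a toy matrix, not the study's data, and it reports overall accuracy (correct predictions divided by all cases), which is an assumption about how ACC was defined.

```python
def macro_metrics(cm):
    """Summarize a confusion matrix where cm[i][j] counts true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])  # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": sum(cm[k][k] for k in range(n)) / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }
```

For the toy matrix [[2, 0], [1, 1]], three of four cases lie on the diagonal, so the accuracy is 0.75; the per-class recalls are 1.0 and 0.5, giving a macro recall of 0.75.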

Fig. 4 Confusion matrix on radiomics combined model: the two-classification in train cohort (a) and test cohort (b), the three-classification in train cohort (c) and test cohort (d), and the five-classification in train cohort (e) and test cohort (f)

Fig. 5 ROC curve on radiomics combined model: the two-classification in train cohort (a) and test cohort (b), the three-classification in train cohort (c) and test cohort (d), and the five-classification in train cohort (e) and test cohort (f)

Discussion

In this research endeavor, our objective was to devise classification models for two, three, and five histological stratifications of pulmonary adenocarcinoma nodules by integrating radiomics features with clinic-radiological characteristics. Additionally, we conducted a comprehensive comparison of various machine learning techniques, including LR-OVR, SVM-OVR, LR-OVO, SVM-OVO, DT, KNN, RF, and GBDT, to identify the most suitable model for multi-classification tasks in predicting the histological subtypes of pulmonary nodules. Consequently, the SVM-OVO model emerged as the optimal choice, exhibiting superior overall performance in accurately predicting the histological subtypes of pulmonary nodules.

Undoubtedly, the accurate histological classification of LAC, which evaluates invasiveness and differentiation, is pivotal in determining appropriate treatment strategies. Typically, this classification relies on postoperative pathological examination following complete surgical resection [9, 37]. Consequently, the development of a non-invasive and convenient method to predict the histological classification of pulmonary nodules based on preoperative CT images holds substantial clinical significance. Radiomics, as a burgeoning field, has shown great promise in the diagnosis, treatment, and monitoring of pulmonary nodules, surpassing the capabilities of radiologist-based assessments, as evidenced by a growing body of literature [15, 21, 22, 38].

In our present investigation, we evaluated multiple common machine learning models covering the diverse histopathologic stratifications of pulmonary adenocarcinoma nodules, as per the fifth WHO classification of lung tumors, namely the subtypes AIS, MIA, WIAC, MIAC, and PIAC. Upon evaluating their discriminatory abilities, we observed that the SVM model yielded the most favorable outcomes in the testing cohort for the two-classification task, while SVM-OVO demonstrated superior performance among the ML models in the testing cohort for the three- and five-classification tasks. The AUC ranged from 0.787 to 0.942, and the ACC ranged from 0.607 to 0.894. Notably, our findings align with previous research that highlighted the satisfactory performance of classifying pulmonary adenocarcinoma nodules in the two-classification task [22, 25, 39]. Furthermore, our results indicate that the ACC achieved in the five-classification task surpassed that of a previous multi-classification of the predominant histologic pattern [40].

However, previous studies focusing on the classification of pulmonary nodules have generally overlooked the crucial aspect of selecting appropriate machine learning models. In light of this gap, our study sought to compare various models and assess their performance in classifying pulmonary nodules. Encouragingly, our findings aligned with prior research, demonstrating that the SVM outperformed other machine learning models in the testing cohort. SVM, widely employed for classification and predictive modeling tasks, has established itself as a reliable choice, even when confronted with limited data availability [41,42,43].

In the realm of multi-class classification, two widely adopted strategies are the OVO and OVR approaches. Park et al. utilized the OVO method to predict the three primary subtypes of lung adenocarcinoma, as described in their study [21]. Similarly, Chen et al. employed the OVR method to construct a three-classification model for the preoperative prediction of risk stratification in gastrointestinal stromal tumors [36]. In another investigation by Liu et al., a four-classification OVO model was established to differentiate subtypes of non-small cell lung cancer [44]. When comparing the performance of the ML models for multi-classification, it was consistently observed that the OVO approach outperformed the OVR method in the testing cohort, which aligns with the findings reported by Liu et al.

Previous studies have also highlighted the correlation between clinic-radiological features and the histological classification of pulmonary nodules. Consequently, experienced radiologists can utilize clinic-radiological features to classify the pathologic subtypes of pulmonary nodules [2, 21, 45]. Certain unique or characteristic features, such as the presence of minute airspaces or dilated vessels within the lesions, play a significant role in nodule classification. However, these specific features may not exhibit a correlation with the radiomics feature category [46]. To enhance the predictive capability of the radiomics model for the classification of pulmonary nodules, we developed a combined model that incorporates both radiomics and clinic-radiological features. Our study demonstrated that the predictive power of the combined model surpassed that of the radiomics and clinic-radiological models, regardless of whether the classification task involved two classes or multiple classes.

The predictive performance of high-grade IAC in the testing cohort of three-classification combined models was found to be lower compared to non-IAC and low-grade IAC, which is in contrast to the results obtained from the five-classification model. This discrepancy may be attributed to the relatively lower number of high-grade IAC patients, resulting in data bias. Additionally, the predictive power of MIAC in the five-classification combined model was lower than that of other subtypes. This could be attributed to the insufficient sample size, leading to uncertainties in the analysis and slight variations in the proportion of invasive components within some MIAC patients.

It is important to acknowledge several limitations of the present study. First, the study's retrospective design, utilization of a single central dataset, and absence of external validation may limit the generalizability of the developed models. Second, the relatively small sample size may impact the statistical power of the analysis. Third, the incorporation of deep learning and improvement of machine learning models are necessary as more data become available. Fourth, the presence of spread through air spaces, which is commonly observed in patients with invasive adenocarcinoma, is closely linked to patient prognosis. Therefore, further investigations are warranted to establish additional radiomics classifications for predicting pathological characteristics.

In conclusion, this study underscores the importance of selecting appropriate machine learning models and demonstrates the utility of multi-classification radiomics combined with clinic-radiological features in predicting the invasiveness and differentiation of pulmonary adenocarcinoma nodules. The SVM-OVO model for the multi-classification task exhibited the best overall performance and successfully predicted the histological stratification of non-invasive subtypes.