Introduction

Non-small-cell lung cancer (NSCLC) is the most common type of lung cancer (85–90% of cases) and one of the most common cancers worldwide. Although NSCLC tends to follow a relatively indolent clinical course, bone metastasis remains an important issue: bone metastasis is present at initial diagnosis in 30–40% of lung cancer patients [1].

The high risk of bone metastasis significantly increases the mortality of NSCLC patients and reduces their quality of life [2, 3]. As a result, identifying bone metastasis in NSCLC has become a major focus of research. Numerous factors associated with bone metastasis in NSCLC have been identified, including elevated serum hepatoma-derived growth factor levels, the expression of L1-cell adhesion molecule, and high osteopontin expression [4,5,6]. Based on these factors, various prediction models have been developed, such as nomograms and machine learning models, which predict bone metastasis with a certain degree of accuracy. However, these models consider only a restricted number of predictive factors and therefore do not fully meet clinical needs.

In light of these challenges, our research seeks to improve the prediction of bone metastasis in NSCLC by focusing on risk factors derived from medical imaging, which offer significant predictive advantages over traditional clinical cohort risk factors. Unlike conventional approaches that rely primarily on biochemical markers and histological features, medical imaging provides a comprehensive and non-invasive assessment of tumor characteristics and bone integrity. Advanced imaging techniques, such as CT, MRI, and PET scans, can reveal detailed structural and functional information that is often not captured by traditional methods. By integrating these imaging-based risk factors into our predictive models, we aim to enhance the accuracy and clinical applicability of bone metastasis predictions. This approach not only broadens the spectrum of predictive factors but also leverages the high-resolution, quantitative data from medical imaging to improve early detection and intervention strategies for NSCLC patients at risk of bone metastasis. Our research thus represents a significant step forward in addressing the limitations of current prediction models and meeting the clinical needs for more reliable and comprehensive diagnostic tools.

Before the advent of artificial intelligence, it was not possible to capture from images the tumor phenotypic characteristics that are imperceptible to the human eye. Computed tomography (CT), which can quantify tissue density, is the first-line noninvasive imaging method for lung cancer and is used for diagnosis, preoperative assessment, treatment planning and surveillance [7]. However, the intrinsic characteristics of tumors require further study. Recently, several studies have reported the feasibility and superiority of detecting lung cancer features from CT images [8,9,10].

Radiomics is the high-throughput mining of quantitative image features. The extracted features can potentially be applied in clinical decision support systems to improve diagnostic and prognostic accuracy [11, 12]. Biomarkers extracted from images have been reported to show effective predictive performance across a range of cancer types [13].

In this study, we used radiomics to extract risk biomarkers of NSCLC from preoperative contrast-enhanced CT (CECT) images. We aimed to establish a non-invasive, individualized nomogram for predicting bone metastasis in NSCLC, based on the extracted CECT features and clinicopathological risk factors. The nomogram could potentially be applied in clinical trials, precision medicine practices, and tailored clinical therapy.

Materials and methods

Patients

This retrospective study was approved by the institutional review board of Tianjin Medical University Cancer Institute & Hospital, and the requirement to obtain informed consent was waived (clinical trial: bc2021008). For the data cohort, we collected CECT images from 318 patients scanned between January 2009 and December 2019.

The study inclusion criteria were as follows: (i) the tumor was pathologically diagnosed as NSCLC; (ii) bone metastasis was diagnosed by imaging, based on at least one credible piece of imaging evidence (X-ray, CT, magnetic resonance imaging, emission computed tomography, or positron emission tomography-CT); (iii) surgical resection or other treatment, such as chemotherapy, targeted therapy, or immunotherapy, was not performed before the CT scan; (iv) the baseline clinical data and the original high-resolution CT images were available. The exclusion criteria were as follows: (i) the pathological diagnosis of the primary lung site was uncertain; (ii) the bone metastasis diagnosis was uncertain; (iii) the lung tumor was unclear in the CT images; (iv) the baseline clinical data or CT images were incomplete or unavailable; (v) the pathological type of lung cancer was unknown.

The patients were divided into two groups according to the occurrence of bone metastasis. In total, 318 patients were identified and randomly divided into a training cohort (132 males and 92 females; mean age, 59.7 ± 8.9 years; range, 36 to 86 years) and a validation cohort (48 males and 46 females; mean age, 59.5 ± 8.6 years; range, 32 to 78 years) (Fig. 1).

Fig. 1

A general flowchart of patient inclusion and exclusion. A total of 318 NSCLC patients were finally included from 13,830 patients. NSCLC, non-small-cell lung cancer; CECT, contrast-enhanced computed tomography

Clinical characteristics

Baseline clinicopathological data, including age, sex, blood group, smoking history, drinking history, history of malignancy, family history of malignancy, single or multiple lung tumors, lymph node status, metastatic status of other organs, Karnofsky performance status (KPS) score and tumor pathology type, were derived from the medical records. The original high-resolution CT images were also collected.

CT acquisition

Data from a total of 13,775 patients with NSCLC at Tianjin Medical University Cancer Institute & Hospital were collected. Original high-resolution CECT images were available for 502 patients.

CT was performed at our institution with a GE CT scanner (GE 750); all scan parameters are shown in Supplement Table 1. The patients lay in a supine position with their hands above their heads. The contrast agent iohexol was injected through the elbow vein at a flow rate of 3 ml/s and a dose of 1.5–2 ml/kg. The scanning range extended from the thoracic inlet to the costophrenic angle. The patients were scanned during a single breath-hold. All images were stored in IMA format for radiomics analysis.

CT segmentation and feature extraction

The primary lung tumor was delineated manually in ITK-SNAP (ITK-SNAP 3.8.0; www.itksnap.org). All manual segmentations were performed by two radiologists, one of whom performed the segmentation twice. Both radiologists had more than five years of experience in lung cancer imaging. The CT images with the region of interest (ROI) were exported in nii format. Radiomics features can capture differences in tumor phenotype by examining a large set of quantitative features. The textural, morphological, intensity, law, and wavelet features were extracted automatically using open-source software (Pyradiomics; http://pyradiomics.readthedocs.io/en/latest/index.html) [14, 15]. The radiomics feature set is described in detail in Supplement 2.
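As a rough illustration of the extraction step, the sketch below shows how features might be computed with PyRadiomics for one image and mask pair. The helper names `extract_features` and `split_features`, the file paths passed to them, and the choice to enable only the original and wavelet image types are assumptions made for illustration; the settings actually used in this study are those described in Supplement 2.

```python
def extract_features(image_path, mask_path):
    """Run PyRadiomics on one CT volume and its tumor mask (hypothetical paths)."""
    # Imported lazily so that split_features can be used without PyRadiomics.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.enableAllFeatures()                # all available feature classes
    extractor.enableImageTypeByName("Original")  # unfiltered image
    extractor.enableImageTypeByName("Wavelet")   # wavelet-decomposed images
    return extractor.execute(image_path, mask_path)


def split_features(result):
    """Separate numeric radiomics features from PyRadiomics diagnostic entries."""
    features, diagnostics = {}, {}
    for name, value in result.items():
        if name.startswith("diagnostics_"):
            diagnostics[name] = value          # software versions, image metadata
        else:
            features[name] = float(value)      # one scalar per radiomics feature
    return features, diagnostics
```

With hypothetical file names, a call such as `split_features(extract_features("patient001_ct.nii", "patient001_roi.nii"))` would return one feature dictionary per patient, which can then be stacked into the feature matrix used for selection.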

Feature selection

Feature selection for the radiomics signature was performed with the Kruskal-Wallis test, Spearman correlation and LASSO in R version 3.6.2 [16]. Spearman’s correlation coefficient was used to assess the correlation and redundancy among features. The features related to bone metastasis were selected by the least absolute shrinkage and selection operator (LASSO) logistic regression method [17]. A Rad-score was generated as a linear combination of the selected features weighted by their LASSO coefficients. The Mann-Whitney U test was then used in the training and validation cohorts to assess the potential association between Rad-scores and bone metastasis.
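The redundancy filter and the Rad-score described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the R code used in the study: the function names, the correlation threshold of 0.9, and the greedy keep-first strategy are assumptions, and the LASSO coefficients are taken as given rather than fitted here.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def drop_redundant(columns, threshold=0.9):
    """Keep a feature only if |rho| with every already-kept feature is below threshold.

    The threshold is an assumed value, not one reported in the study.
    """
    kept = []
    for name in columns:
        if all(abs(spearman(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of the selected features weighted by LASSO coefficients."""
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)
```

Here `columns` maps each feature name to its values across patients, and `coefficients` maps each LASSO-selected feature name to its nonzero weight.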

Development of CT radiomics nomogram

The clinical risk factors associated with bone metastasis were identified by univariate analysis. Multivariable logistic regression analysis including the Rad-score and independent clinical variables was conducted to select the final predictors of bone metastasis. Based on the binary logistic regression analysis in the training group, a CT nomogram was developed. For comparison, non-significant features were excluded during clinical model construction, and only independent clinical risk factors were used to develop the clinical prediction models. The discriminative ability of the model was evaluated with Harrell’s concordance index (C-index) and the receiver operating characteristic (ROC) curve. A calibration curve (1000 bootstrap resamples) was used to evaluate the calibration of the nomogram.
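For a binary outcome such as bone metastasis, the C-index coincides with the area under the ROC curve, and both can be computed by counting concordant pairs. The sketch below illustrates this, together with a percentile bootstrap interval. The function names and the seed are assumptions, and the bootstrap interval is shown only as one common way to quantify uncertainty; it is not necessarily how the intervals reported in this paper were obtained.

```python
import random


def auc(scores, labels):
    """Probability that a random positive case scores higher than a random negative one.

    Ties count as one half. For a binary outcome this equals the C-index.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def bootstrap_auc_interval(scores, labels, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample patients with replacement
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # the AUC is undefined unless both classes are present
        estimates.append(auc([scores[i] for i in idx], sample_labels))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

Here `scores` would be the predicted probabilities (or Rad-scores) and `labels` the observed bone metastasis status coded as 1 or 0.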

Statistical analyses

The calibration curve and the Hosmer-Lemeshow test were used to evaluate the calibration performance of the CECT nomogram. The area under the ROC curve (AUC) was used to evaluate the discrimination performance of the nomogram. For clinical use, the predicted probability for each patient is calculated from the nomogram. Statistical analysis was conducted using R software 3.6.2 and SPSS 23.0 (SPSS, Inc., Chicago, IL). All tests were two-sided, and P values of less than 0.05 were considered statistically significant.
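To make the calibration check concrete, the following sketch computes the Hosmer-Lemeshow statistic from predicted probabilities and observed outcomes. It assumes equal-sized groups ordered by predicted risk (ten by default, a conventional choice rather than one stated in this paper) and returns only the statistic and its degrees of freedom; the P value would be read from a chi-square distribution with those degrees of freedom.

```python
def hosmer_lemeshow_statistic(probabilities, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups.

    Returns the statistic and its degrees of freedom (groups - 2).
    """
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        size = len(chunk)
        # Event and non-event terms combined: n_g * p_bar * (1 - p_bar).
        variance = expected * (1 - expected / size)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic, groups - 2
```

A small statistic relative to its degrees of freedom indicates that predicted and observed event counts agree within each risk group.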

Results

Clinical characteristics

Patients’ clinical, pathological and radiological characteristics in the training and validation cohorts are summarized in Table 1. To evaluate the value of the image-derived biomarkers for predicting bone metastasis, a total of 318 patients with clear imaging of the lung tumor, complete clinical data, and a known pathological type were included in the present study. Among these 318 patients, 150 were diagnosed with bone metastasis (47.17%) and 168 were without bone metastasis (52.83%).

Table 1 Clinical characteristics of patients in the training and validation cohorts

Establishment of CT radiomics signature

Based on the 223 CT images in the training group, 2,570 features were extracted. The LASSO algorithm reduced these to twelve features in the training cohort (Fig. 2). We then evaluated the predictive efficacy of the CT features. The CT Rad-scores were higher in patients with bone metastasis than in those without in both the training and validation sets, with AUCs of 0.699 (0.637–0.761) and 0.684 (0.584–0.783), respectively (Fig. 3).

Fig. 2

Illustration of segmentation, feature extraction, and model construction using radiomics. The workflow of radiomics (A). Radiomics feature selection using the LASSO logistic regression model in the training cohort (B). The lambda value that gave the minimum average binomial deviance was used to select features. Dotted vertical lines were drawn at the optimal values by using the minimum criteria and the 1 standard error of the minimum criteria (the 1-SE criteria). The optimal lambda value resulted in 12 nonzero coefficients

Fig. 3

Validation of the prediction model. Performance of the radiomics signature for predicting bone metastasis in the training and validation cohorts

Bone metastasis predictors identification and evaluation

In the univariate logistic regression analysis of clinical and imaging semantic data, the risk factors significantly associated with bone metastasis in NSCLC patients were blood type, distant organ metastasis, pathological type, spicule sign, lobulation sign, pleural indentation sign, and ground-glass opacity/note (Table 2).

Table 2 Univariate regression analysis of patients with bone metastasis

Multivariate logistic regression analysis suggested that four factors were associated with the occurrence of bone metastasis in NSCLC: distant organ metastasis (OR = 3.003, P = 0.001), spicule sign (OR = 1.858, P = 0.019), lobulation sign (OR = 3.492, P < 0.001), and ground-glass opacity/note (OR = 1.928, P = 0.014) (Table 3).

Table 3 Multi-factor regression analysis of patients with bone metastasis
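The odds ratios from the multivariable analysis can be related to the logistic regression coefficients that underlie a nomogram, since each coefficient is the natural logarithm of the corresponding odds ratio. The sketch below illustrates the conversion using the four reported odds ratios. It assumes that each factor is coded as present or absent and that the model contains no interaction terms, and the 100-point scaling is an illustrative convention; the published nomogram also includes the Rad-score and pathological type, whose coefficients are not reproduced here.

```python
import math

# Odds ratios reported in the multivariable analysis.
odds_ratios = {
    "distant organ metastasis": 3.003,
    "spicule sign": 1.858,
    "lobulation sign": 3.492,
    "ground-glass opacity/note": 1.928,
}

# A logistic regression coefficient is the natural log of the odds ratio.
coefficients = {name: math.log(value) for name, value in odds_ratios.items()}

# Nomogram-style points: the largest coefficient is scaled to 100 points
# (an illustrative convention, not the scaling of the published nomogram).
largest = max(coefficients.values())
points = {name: 100 * beta / largest for name, beta in coefficients.items()}


def combined_odds_ratio(present):
    """Odds multiplier when all listed binary factors are present (no interactions assumed)."""
    return math.exp(sum(coefficients[name] for name in present))
```

Under these assumptions, a patient with both a spicule sign and a lobulation sign would have odds of bone metastasis multiplied by the product of the two odds ratios relative to a patient with neither sign.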

Nomogram development and validation

Rad-score, distant organ metastasis, spicule sign, lobulation sign and ground-glass opacity/note were confirmed as independent predictors of bone metastasis in NSCLC by multivariate logistic regression. Although the pathological type of the tumor was not an independent risk factor after multivariate logistic regression, it was still included in the prediction model because of its reported significant role in NSCLC. A CT radiomics nomogram incorporating these predictors was constructed (Fig. 4). The radiomics nomogram achieved an AUC of 0.745 (P < 0.001) in the training cohort and 0.808 (P < 0.001) in the validation cohort (Fig. 5). The prediction accuracy of the model was assessed using 1000 bootstrap resamples. The calibration curves showed satisfactory calibration in the training and validation cohorts (Fig. 4; Table 4).

Fig. 4

CECT radiomics nomogram for the prediction of bone metastasis. (A) Nomogram to estimate the risk of bone metastasis preoperatively in NSCLC. To use the nomogram, find the position of each variable on the corresponding axis, draw a vertical line to the points axis to read the number of points, add the points from all of the variables, and draw a line from the total points axis to the bottom line of the nomogram to determine the probability of bone metastasis. (B, C) Calibration curves of the CECT radiomics nomogram in the training (B) and validation (C) sets. Calibration curves depict the agreement between the predicted probabilities of bone metastasis and the observed outcomes. The dotted blue line represents an ideal prediction, and the dotted black line represents the predictive ability of the nomogram. The closer the dotted black line is to the dotted blue line, the better the predictive accuracy of the nomogram

Fig. 5

Prediction accuracy of the CECT radiomics nomogram for the estimation of bone metastasis in patients with NSCLC in the training and validation cohorts. ROC curves show good prediction performance of the nomogram in the training and validation cohorts. ROC, receiver operating characteristic

Table 4 Comparison of prediction models for bone metastases in NSCLC

Discussion

Bone is one of the most common metastatic sites in lung cancer [18]. Bone metastasis from lung cancer can cause severe skeletal disease, including bone pain, pathological fracture, spinal instability, spinal cord compression, hypercalcemia and other skeletal-related events (SREs) [19]. Bone metastasis usually occurs at an advanced stage and decreases patients’ quality of life. Among NSCLC patients, the incidence of bone metastasis is around 40% [20, 21]. NSCLC patients with bone metastasis have a one-year survival rate of 40–50% [22]. Screening for bone metastasis in patients with NSCLC is therefore important for treatment decisions.

To date, the mechanism of bone metastasis in lung cancer remains to be elucidated. The seed-and-soil theory is widely accepted [23]. A recent study reported that distant metastasis might be an early event that accompanies the occurrence of the primary tumor. The primary tumor pre-selects tumor cells, which can be regulated by osteomimicry [24]. Thus, the features of the primary tumor can be informative when studying distant metastasis. Our study suggested that the CECT radiomics of the primary tumor can reflect bone status. Radio-semantic features, including spicule sign, lobulation sign, and ground-glass opacity/note, were found to be highly associated with bone metastasis in NSCLC. Many investigators have shown that the spicule sign and lobulation sign in ground-glass opacity/note arise because tumor cells seldom grow at the same rate in all directions and infiltrate the adjacent bronchial vascular sheath or local lymphatic vessels [25]. Therefore, it was speculated that these highly active “seeds” could affect both tumor progression, through cell proliferation, and aggressiveness, including leaving the primary site, entering the bloodstream, and developing into bone metastasis.

Until now, there has been no effective strategy for bone metastasis screening or early diagnosis. Several prediction systems have been established, but none of them satisfactorily meets the clinical need [26]. Our previous research reported the feasibility of predicting distant metastasis from various features of the primary site [27]. Thus, we attempted to develop a CECT-based radiomics nomogram for the preoperative prediction of bone metastasis in patients with NSCLC. It is widely accepted that tumors are angiogenesis-dependent diseases. Angiogenesis is one of the essential factors leading to tumorigenesis, progression and distant metastasis [28]. CECT, which has a superior ability to reflect microvascular perfusion and captures different aspects of anatomical and biological information about a tumor, has been applied in the diagnosis of lung tumors in clinical practice. CT-based radiomics has shown its ability to differentiate between benign and malignant lung nodules [29]. However, to the best of our knowledge, no studies have investigated the association between CECT radiomics features and bone metastasis in NSCLC.

In exploring advanced methodologies to enhance our predictive model for bone metastasis in NSCLC, it is valuable to consider recent developments in machine learning. A recent study demonstrated how triplet loss can improve feature representation, which can be applied to radiomics to distinguish more effectively between patients with and without bone metastasis [30]. Additionally, “Weakly supervised machine learning” offers techniques to leverage imperfect data, enhancing a model’s robustness when high-quality labeled data are scarce [31]. Lastly, “Deep learning in food category recognition” highlights the benefits of information fusion from multiple data sources. Integrating such fusion techniques can improve the accuracy of our predictive models by combining radiomic features with diverse clinical data, thus offering a more comprehensive approach to early detection and treatment planning for NSCLC patients [32].

Clinicopathological risk factors, including distant organ metastasis and pathological type, were previously reported to be associated with bone metastasis in NSCLC [24]. To provide an easy-to-use tool for clinical use, we developed a radiomics nomogram based on the multivariate logistic regression analysis. The nomogram combining CT radiomics features and clinical features showed significantly better prediction of bone metastasis than any nomogram built on a single type of feature. Radiomics has recently been applied to imaging modalities including ultrasound and CT to predict regional lymph node metastasis in certain cancers [33, 34]. Compared with previous studies, our study achieved its predictive performance by combining clinical and pathological parameters with the radiomics method, which complements image features with more information on the patient’s risk.

The superior performance of our proposed method can be attributed to several key factors. First, both the radiomic and clinical features were integrated. Unlike many state-of-the-art approaches that rely solely on either imaging data or clinical data, our model integrates radiomic features from contrast-enhanced CT scans with comprehensive clinical features. This multi-modal approach allows for a more holistic view of the patient’s condition, capturing both the detailed tumor phenotype and relevant clinical indicators, leading to improved prediction accuracy. Second, the advanced feature selection technique was used. We employed robust feature selection methods, including the Kruskal-Wallis test and LASSO regression, to identify the most relevant radiomic features associated with bone metastasis. This rigorous selection process ensures that only the most predictive features are included in the model, enhancing its overall performance. Third, the high-quality imaging data were included. The use of high-resolution contrast-enhanced CT scans provides detailed and quantitative imaging data, which is critical for accurate radiomic analysis. The high-quality imaging data contributes significantly to the model’s ability to detect subtle differences in tumor characteristics that may indicate bone metastasis. These factors collectively contribute to the improved performance of our proposed method compared to existing state-of-the-art approaches. We believe that the integration of diverse data sources, rigorous feature selection, and validation processes are key to achieving better predictive outcomes in clinical settings.

To enhance the predictive performance of radiomics models, future studies should integrate multiple imaging modalities like Positron Emission Tomography (PET), and Single Photon Emission Computed Tomography (SPECT). These provide complementary information on tumor microenvironment and bone integrity, potentially improving accuracy. Exploring longitudinal changes in radiomic features can reveal insights into metastasis progression and treatment response. Advanced machine learning techniques, such as deep learning and ensemble learning, can further optimize feature selection and prediction. Finally, validating models in larger, multi-center cohorts is essential for ensuring generalizability and clinical applicability.

Our study has several limitations. First, this was a retrospective, single-center study, which makes it hard to evaluate the generalizability of the outcomes to other cohorts. Prospective studies with larger patient cohorts need to be performed before the nomogram is applied widely in practice. Second, although CT image segmentation was performed by experienced radiologists, the subjectivity of the radiologists may have influenced the results.

Conclusion

In conclusion, this study revealed for the first time that CT image features of the primary lung cancer are correlated with bone metastasis in patients with NSCLC. This signature offers an auxiliary strategy for the early identification of and screening for bone metastasis in NSCLC. Individualized treatment and metastatic screening plans can be tailored to high-risk patients based on the established nomogram. The nomogram showed feasibility and potential clinical applicability for predicting bone metastasis in NSCLC.