A computed tomography radiomics-based model for predicting osteoporosis after breast cancer treatment

Many treatments against breast cancer decrease the level of estrogen in blood, resulting in bone loss, osteoporosis and fragility fractures in breast cancer patients. This retrospective study aimed to evaluate a novel opportunistic screening for cancer treatment-induced bone loss (CTIBL) in breast cancer patients using CT radiomics. Between 2011 and 2021, a total of 412 female breast cancer patients who received treatment and were followed up in our institution, had post-treatment dual-energy X-ray absorptiometry (DXA) examination of the lumbar vertebrae and had post-treatment chest CT scan that encompassed the L1 vertebra, were included in this study. Results indicated that the T-score of L1 vertebra had a strongly positive correlation with the average T-score of L1–L4 vertebrae derived from DXA (r = 0.91, p < 0.05). On multivariable analysis, four clinical variables (age, body weight, menopause status, aromatase inhibitor exposure duration) and three radiomic features extracted from the region of interest of L1 vertebra (original_firstorder_RootMeanSquared, wavelet.HH_glcm_InverseVariance, and wavelet.LL_glcm_MCC) were selected for building predictive models of L1 T-score and bone health. The predictive model combining clinical and radiomic features showed the greatest adjusted R2 value (0.557), sensitivity (83.6%), specificity (74.2%) and total accuracy (79.4%) compared to models that relied solely on clinical data, radiomic features, or Hounsfield units. In conclusion, the clinical-radiomic predictive model may be used as an opportunistic screening tool for early identification of breast cancer survivors at high risk of CTIBL based on non-contrast CT images of the L1 vertebra, thereby facilitating early intervention for osteoporosis. Supplementary Information The online version contains supplementary material available at 10.1007/s13246-023-01360-2.


Introduction
Globally, 1.7 million women are diagnosed with breast cancer every year, with overall survival estimated to be greater than 80% in high-income countries [1].In Taiwan, breast cancer is the most common cancer among women, and the incidence rate is increasing [2].Breast cancer therapy, including chemotherapy, endocrine therapy and oophorectomy, is associated with increased fracture risk due to deterioration of bone quantity and quality caused by decreased blood level of estrogen [3][4][5].This cancer treatment-induced bone loss (CTIBL) represents the most common long-term adverse event in breast cancer survivors and is responsible for osteoporosis and fragility fractures [3].Therefore, the early diagnosis of osteoporosis and prevention of bone mineral loss are important to further support the quality of life and life expectancy of breast cancer survivors [6].
During diagnosis and treatment for breast cancer, thoracic computed tomography (CT) is widely used to monitor disease status and treatment efficacy [7].Nonetheless, dualenergy X-ray absorptiometry (DXA) examination, the goldstandard method for assessing bone mineral density (BMD), may not always be a component of the post-treatment regimen [8], despite recommendations for BMD assessment and regular monitoring [9,10].In contrast, the regular use of CT scans in the patients who do not undergo DXA examination has been used for opportunistic osteoporosis screening and evaluation of bone loss [11,12].CT attenuation values in Hounsfield units (HU) of the lumbar vertebra calculated from the images were applied to opportunistically detect patients at high risk of osteoporosis, thereby avoiding DXArelated radiation exposure or costs [11].However, current imaging modalities, such as DXA and CT HU, only measure bone density but fail to assess bone quality, another important aspect of bone strength which takes bone architecture, turnover, damage accumulation, and mineralization into account.In contrast, radiomics, the high-throughput extraction of large amounts of image features from radiographic images, has recently attracted much attention [13].Although extensively used in clinical oncology [14], this approach has been limitedly explored for the detection and diagnosis of bone diseases [15].Radiomic features, ranging from pixel density and arrangement to textures, intensity and wavelet features, can provide a comprehensive evaluation of both bone quantity and quality.
The present study proposed novel CT radiomics-based opportunistic screening methods for detecting CTIBL in breast cancer patients.Because the chest CT scan routinely extended from thoracic inlet to upper abdomen, the L1 vertebra was always encompassed in the chest CT scan.Therefore, we hypothesized that the CT image of the L1 vertebra may be useful for the development of predictive model for treatment-induced osteoporosis in breast cancer survivors.To test this hypothesis, this study aimed: (1) to examine whether L1 T-score was significantly correlated with the L1-L4 mean T-score (gold standard); (2) to extract radiomic features from axial non-contrast CT images of the L1 vertebra for each patient and (3) to apply radiomic features and/or clinical data to build predictive models for L1 T-score and bone health.

Study participants
This retrospective study was approved by the institutional review board of the National Cheng Kung University Hospital (NCKUH).Female patients who were diagnosed with breast cancer and received a complete course of treatment and follow-up between 2011 and 2021 were initially selected.Inclusion criteria were: (i) patients who underwent post-treatment thoracic CT scan and had unenhanced CT images that showed L1 vertebra; (ii) patients who underwent post-treatment DXA scan of the lumbar spine (L1 to L4 vertebrae) and (iii) the time interval between CT scan and DXA scan was less than 1 year.Exclusion criteria were: (i) patients with lumbar scoliosis, lumbar compression fractures, or radiodense osteophytosis in the lumbar vertebrae and (ii) patients with a history of lumbar surgery.

Study variables
Subjects' demographic information and clinical characteristics were recorded from medical charts, such as age, body weight, menopause status, tobacco and alcohol use, exercise habits, comorbidities related to the fracture risk assessment tool (FRAX), T-scores of the lumbar spine derived from DXA and systemic treatments for breast cancer such as chemotherapy, endocrine therapy and targeted therapy.Commonly used endocrine therapy for breast cancer included tamoxifen (TAM) for premenopausal women to block estrogen receptors and aromatase inhibitor (AI) for postmenopausal women to lower estrogen levels.The exposure duration of endocrine therapy was also collected.The BMD was measured with GE Lunar Prodigy and the newer GE Lunar iDXA densitometers (both from GE Healthcare Lunar, Madison, WI, USA).The T-score was a standard deviation showing how much the measured bone density differed from the bone density of a healthy young adult.According to the World Health Organization (WHO), osteoporosis, osteopenia and normal subjects were defined based on mean lumbar spine (L1-L4) T-score: osteoporosis (T-score ≤ -2.5), osteopenia (-2.5 < T-score < -1.0) and normal (T-score ≥ -1.0) [16].

Radiomic feature extraction on non-contrast CT images
A total of 5 CT scanners were used for non-contrast chest CT scanning between 2011 and 2021.Of them, 2 scanners were purchased from GE Healthcare (Chicago, IL, USA) and 3 scanners were obtained from Siemens Healthineers (Erlangen, Germany).Additional manufacturer information is summarized in Supplementary Table S1.All non-contrast CT images were collected using Digital Imaging and Communications in Medicine (DICOM) format and retrieved from the picture archiving and communication system (PACS) at NCKUH.The DICOM images were saved into Neuroimaging Informatics Technology Initiative file format, and the image segmentation was performed using a selfinvented image-labeling tool running on INFINITE PACS 3.0 as previously described [17].Radiomic features were calculated using wavelet and Laplacian of Gaussian with sigma filters.
The region of interest (ROI) on axial non-contrast CT image of the L1 vertebra was manually labeled by two radiologists who were blinded to the clinicopathological details of patients.The ROI was drawn on the mid-vertebral level of the L1 vertebral body to avoid cortical bone and the basivertebral vein (Supplementary Fig. S1), as previously described [18].Quantitative radiomic features were then extracted from the ROIs using the PyRadiomics v3.0.1 package [19].All steps of ROI labeling were repeated one month later by the junior radiologist to test the intra-observer variability [20].
Radiomic features were defined according to the PyRadiomics library [17].Radiomic features for each patient were extracted three times by two radiologists (twice from the junior radiologist and once from the senior radiologist).A total of 477 radiomic features were initially extracted from the CT images of the patient population.According to the previously published criterion, the intraclass correlation coefficients (ICC) < 0.75 [20], 28 radiomic features were excluded.Therefore, 449 radiomic features with good to excellent ICC were retained for statistical analysis.
Subsequently, 8 out of 449 radiomic features with optimal lambda values were selected by the least absolute shrinkage and selection operator (LASSO) regression analysis (Supplementary Table S2).However, because of high correlation with each other, two radiomic features were further excluded.As a result, only 6 radiomic features were retained for the final statistical analysis (Supplementary Table S2).Data processing and analysis for the present study is shown in Fig. 1.

Statistical analysis
Data are presented as mean ± standard deviation (SD) for continuous variables, and n (%) for categorical variables.Spearman correlation analysis was performed to evaluate the association of L1 T-score and the average T-score of L1-L4 vertebrae derived from DXA. Results are presented in a scatter plot with the coefficient of correlation (r) and p-value.Furthermore, associations between L1 T-score and clinical data were evaluated using univariable linear regression analysis, and the results are presented as ß with 95% confidence intervals (95% CI) and p-values.Significant variables (p < 0.05) were included in the subsequent predictive model analysis.
After extraction of radiomic features, four steps were sequentially conducted to select radiomic features for building predictive models for bone mineral loss.First, the useful radiomic features were preliminarily selected using LASSO regression, in which the optimal Lamdba value was set based on minimum of mean cross-validation error.Second, because radiomic features were sensitive to variation across scanners and acquisition protocols, normalization of radiomic features was required [21,22].Therefore, each radiomic feature was normalized as the range from − 1 to 1.The normalized radiomic features were expressed as mean ± SD for the subsequent analysis.Third, the radiomic features that were highly correlated with others were excluded, and the remaining radiomic features were subjected to univariable and multivariable linear regression analyses to evaluate their associations with L1 T-score.Finally, the radiomic features with p < 0.05 identified by either univariable or multivariable analysis were retained as candidate radiomic features for building predictive models with or without clinical data.In addition, to determine whether a linear model was appropriate and to examine the assumption that the statistical model played a vital role, the residual analyses, including checking normality, independence, and testing homogeneity of variance, were performed [23].The results indicated that the statistical model fitted the data well.
Based on the multivariable linear regression results, three models for predicting L1 T-score were built, including Model I (clinical data), Model II (radiomic data) and Model III (both clinical and radiomic data).The results of predictive models were presented as ß with 95% CI and p-values, as well as adjusted R 2 statistics for each model.In order to evaluate the diagnostic accuracy of predictive models, a confusion matrix with a total accuracy, sensitivity, and specificity was calculated based on the golden standard for diagnosis of osteoporosis.Study variables were assessed using a twotailed test and a p-value < 0.05 was considered statistically significant.All statistical analyses were performed using R Statistical software (version 4.0.2;R Foundation for Statistical Computing, Vienna, Austria).

Results
A total of 412 breast cancer patients receiving post-treatment DXA examination and thoracic CT scan were included in the study.Clinical data and CT-based radiomic features were collected from all 412 patients.
Patient demographics and baseline clinical characteristics, including WHO FRAX parameters, systemic treatments, and bone health examination (T-score of L1), are summarized in Table 1.The mean T-score of L1 vertebra for all patients was − 1.05 ± 1.28.The univariable analysis revealed that the L1 T-score was significantly associated with age, weight, post menopause, smoking status, TAM exposure, and AI exposure years (all p < 0.05; Table 1).The correlation analysis showed a high correlation between the T-score of L1 vertebra and the average T-score of L1-L4 vertebrae derived from DXA (r = 0.91, p < 0.05; Fig. 2).
Results of multivariable linear regression analyses of three models for predicting L1 T-score are shown in Table 3.The model I used age, weight, post menopause, and AI exposure, whereas model II considered 3 radiomic  4).Among three predictive models, model III (clinical-radiomic model) had the greatest sensitivity (83.6%), specificity (74.2%) and total accuracy (79.4%), suggesting the superiority of model III over the other two models that relied solely on clinical data or radiomic features in predicting bone heath.
In the present study, 412 patients with a timeframe of less than 1 year between CT scan and DXA scan were included, consisting of 107 patients with 0-30 days, 93 patients with 31-90 days, 88 patients with 91-180 days, and 124 patients with 181-365 days.To evaluate the effects of time interval on predictive performance, the patients were stratified by the time interval between DXA scan and CT scan in two ways.In the first attempt, DEXA-CT.90, patients were divided into two groups using a cut-off of 90 days (0-90 days vs. 91-365 days).In the second attempt, DEXA-CT.30,patients were classified into two groups using a cut-off of 30 days (0-30 days vs. 31-365 days).For predicting T-score, the univariable linear regression analysis revealed that there were no significant differences in predictive performance between 0-90 days and 91-365 days, as well as between 0-30 days and 31-365 days (Table 5).For predicting bone health, the univariable logistic regression analysis also indicated no significant differences in predictive performance between 0-90 days and 91-365 days, as well as between 0-30 days and 31-365 days (Table 5).The results indicated no significant differences in performance of predictive models no matter the shorter or longer DXA-CT intervals.Furthermore, we conducted linear regression analysis to compare performance between the Hounsfield model and three predictive models for predicting L1 T-score (Supplementary Table S3).The results revealed that the adjusted R 2 values were 0.411, 0.452 and 0.557 for the Hounsfield

Discussion
This retrospective study utilized clinical and CT-based radiomic features collected from patients who received posttreatment CT and DXA examinations to build the predictive models for bone loss, in order to early identify breast cancer survivors at high risk of osteoporosis after treatment.A high correlation between the T-score of L1 vertebra and the average T-score of L1-L4 vertebrae derived from DXA was observed.Multivariable analysis revealed that L1 T-score was significantly associated with three radiomic features, including original_firstorder_RootMeanSquared, wavelet.HH_glcm_InverseVariance, and wavelet.LL_glcm_MCC, which were subjected to the predictive model building process.Compared to the clinical data-or radiomic featurebased predictive models, the predictive model combining clinical data and radiomic features had the highest adjusted R 2 value, suggesting a better performance in predicting L1 T-score.Consistently, the model III (clinical-radiomic model) had the greatest sensitivity, specificity, and total accuracy in predicting bone loss.Moreover, the current findings suggested that clinical-radiomic model had a better predictive performance than the Hounsfield model, and the length of time interval between DXA and CT scans did not affect performance of predictive models.
With DXA as reference, opportunistic screening for osteoporosis using CT attenuation values derived from the CT scan has been reported [8,11,21,22].Notably, Pickhardt et al. assessed BMD using CT and DXA for identifying osteoporosis, with an emphasis on L1 measures [24].Recently, Park et al. found that compared to DXA, CT attenuation values can be used for predicting osteoporosis and discriminating incidental fracture risk in breast cancer patients [8].A meta-analysis by Ahern et al. demonstrated that the HU was a clinically useful tool to aide in the diagnosis of osteoporosis; however, determining the optimal HU cut-off was troublesome [25].In contrast, the present retrospective study developed a novel approach with the high throughput radiomic features for opportunistic screening for bone loss in breast cancer survivors after treatment, providing a comprehensive evaluation of bone quantity and quality, and no optimal cut-off values of radiomic features were required.In the present study, the positive correlation between L1 T-score and the average T-score of L1-L4 measured using DXA was first demonstrated, and the predicted L1 T-score was then used for predicting bone loss.CT radiomics-based predictive models were subsequently built.A recent study demonstrated that radiomics analysis based on lumbar spine CT scans was an effective method to screen for osteoporosis, with a greater net benefit than the Hounsfield model [26].In agreement with this finding, the present study found that sensitivity, specificity and accuracy of Model III (clinical and radiomic data) were higher than those of the Hounsfield model, demonstrating the effectiveness of the clinical-radiomic predictive model for the early detection of patients with bone loss after breast cancer treatment, without additional radiation exposure or cost due to DXA.
Endocrine therapy is a standard treatment for hormone receptor-positive breast cancer and is associated with a significant reduction in disease recurrence and improvement in overall survival [27], but endocrine therapy is also associated with an increased risk of osteoporosis and osteoporotic fracture, particularly among patients receiving AI therapy [28].The present study found that the L1 T-score was positively associated with TAM usage and negatively associated with AI exposure duration, consistent with literature about the impact of endocrine therapy on bone loss [26,28].A Canadian retrospective study found a reduced risk of osteoporotic fracture associated with TAM usage over time in postmenopausal patients with early-stage breast cancer [28].A prospective substudy of the Anastrozole, Tamoxifen, Alone or in Combination (ATAC) trial reported that anastrozole was significantly associated with accelerated bone loss over time in postmenopausal women with breast cancer [29].Detailed clinical data with endocrine exposure duration included made our results more convincing.
In oncology, radiomics is primarily used for the noninvasive estimation of a clinical diagnosis or prognosis [14].However, Jiang et al. recently aimed to validate a radiomic signature based on CT scans to screen for lumbar spine osteoporosis [26].The signature model demonstrated excellent prediction performance for osteoporosis, suggesting that this methodology may facilitate surgical decision-making without additional medical costs and radiation exposure [30].The above result is consistent with the current study, as well as with other recent publications utilizing MRI or CT scans, supporting radiomic models based on lumbar spine images to detect osteoporosis [30,31].
Three key radiomic features associated with bone loss identified in the current study were original firstorder_ RootMeanSquared, wavelet-HH_glcm_InverseVariance and wavelet-LL_glcm_MCC.RootMeanSquared (RMS) is defined as the square-root of the mean of all the squared intensity values, which indicates the magnitude of the image values.In the multivariate analysis, the beta value of this radiomic feature was 0.775, so RMS was positively correlated with the L1 T-score.The greater the RMS (the greater magnitude of the image values), the higher the L1 T-score (the higher the bone density, or the more solid and complete the bone structure).The larger the Inverse-Variance value, the smaller the variance and vice versa.In the multivariate analysis, the beta value of this radiomic feature was 0.218, so InverseVariance was positively correlated with the L1 T-score.The larger the InverseVariance (the smaller the variance), the higher the T-score (the higher the bone density, or the more solid and complete the bone structure).The maximal correlation coefficient (MCC) is a measure of complexity of the texture.In the multivariate analysis, the beta value of this radiomic feature was 0.197, so MCC was positively correlated with the L1 T-score.The healthy bone with intact cortex and medulla has a higher MCC value.In contrast, the bone with osteopenia (or osteoporosis) has a lower MCC value.These radiomic features are clinically interpretable and can provide a deep look at the bone quality.
To provide a comprehensive view of the radiomic model's performance a β with 95% CI and R 2 (coefficient of determination) were used, metrics that are widely used to evaluate linear regression models.The β coefficient measures the strength of the linear relationship between the predictor and response variables, and the 95% CI provides an estimate of the confidence of the model's predictions.The R 2 statistic measures the proportion of variance in the response variable that is explained by the predictor variables and it is a useful measure for assessing the overall accuracy of the model [32].Worth noting, area under the curve (AUC) is not typically used for linear regression models, and the current study primarily focused on the prediction of L1 T-score by linear regression models.
Strengths of the current study include the pilot construction of a radiomics-based model for predicting L1 T-score in breast cancer patients, opportunistic use of post-treatment thoracic CT for screening CTIBL and the speed of the screening tool utilized.The inherited limitation of this retrospective, single-institution pilot study was small sample size.Thus, all data were used to build predictive models in the present study.Although the validity of predictive models was determined by residual analysis, the generalization of current findings is difficult due to the lack of additional dataset (test dataset).Therefore, large-scale multicenter studies are warranted to evaluate generalization of the clinical-radiomic predictive model as an opportunistic screening tool.In addition, the present study demonstrated that no significant differences in performance of predictive models no matter the shorter or longer DXA-CT intervals.However, the possibility that small sample sizes lead to non-significant statistical results cannot be ruled out.Hence, to specifically exploring the effect of the longer time interval on performance of clinical-radiomic model, large-scale multi-center studies are warranted.

Conclusion
DXA is not routinely performed during breast cancer treatment in clinical practice, so treatment-induced osteoporosis in breast cancer survivors might be overlooked without prompt management to prevent further bone loss and possible osteoporotic fracture.This study developed a novel clinical-radiomic model for predicting L1 T-score and bone health with superior predictive performance than the Hounsfield model.The proposed clinical-radiomic model may be used as an opportunistic screening tool to help identify breast cancer survivors at high risk of osteoporosis, without DXA-related medical costs and radiation exposure, to achieve early detection and intervention.

Fig. 1
Fig. 1 Flow diagram of data processing and analysis

Fig. 2
Fig. 2 Correlation analysis between T-score of L1 vertebra and the average T-score of L1-L4 vertebrae derived from DXA

Table 1
WHO FRAX World Health Organization fracture risk assessment tool, TAM tamoxifen, AI aromatase inhibitor

Table 2
Associations between L1 T-score and radiomic featuresUnivariate linear regression analysis was conducted, and the results were presented as coefficient (ß) with corresponding 95% confidence intervals (95% CI) and p-value.For multivariable analysis, all 8 radiomic features were proceeded via multivariable linear regression analysis and the results were presented as ß with 95% CI and p-value

Table 3 Multivariable
models for predicting L1 T-score based on clinical data, radiomic data, or combination of clinical + radiomic data The results were presented as ß) with 95% CI and p-value Bold p-value indicated significance (p < 0.05)

Table 4
Diagnostic performance of models for predicting bone health Bone health was categorized as normal (T-score ≥ -1) and bone loss (T-score < -1), Sensitivity (True positive rate) = Truly predicted bone loss/truly observed bone loss, Specificity (True negative rate) = Truly predicted normal bone health/truly observed normal bone health, Total accuracy = (True positive + True negative)/Total population

Table 5
Performance comparison between prediction models with different time intervals between CT and DAX scans.Univariable linear regression models for predicting L1 T-score.Univariable logistic regression models for predicting L1 bone health