Introduction

Lumbar vertebral osteoporosis (LvOPO) and lumbar vertebral osteopenia (LvOPI) are determined by dual-energy X-ray absorptiometry (DeXA). Alternatively, they can also be evaluated by quantitative ultrasound [1], quantitative computed tomography (CT) [2, 3], and quantitative magnetic resonance imaging (MRI) [4, 5].

In LvOPO and LvOPI, reduced bone mineral density (BMD) is usually accompanied by an increase in marrow fat deposition in the lumbar vertebrae. Therefore, evaluation of the lumbar vertebral fat fraction (LvFF) helps in understanding LvOPO and LvOPI. It has recently been shown that patients with LvOPO have higher LvFF than those with LvOPI and healthy participants [6]. Patients with LvOPO are also older than those in the control group [7]. Whether the LvFF serves as an independent predictor in diagnosing LvOPO is clinically important. However, this question was not investigated until 2016, when Zhang et al. diagnosed LvOPO by the LvFF with a criterion of 0.674, achieving an area under the curve (AUC), sensitivity, and specificity of 0.740, 79.2%, and 72.4%, respectively [7].

In clinical decision rules, age, body weight, history of hormone use, fracture, rheumatoid arthritis, and lack of estrogen have been used as predictors [8]. Body weight (≤ 70 kg), post-menopausal years (≥ 1 year), age (> 51 years), and history of fragility fracture after age 40 have been used to predict the risk of low BMD (T-score < −2) in women aged 40–60 years, showing higher sensitivity and AUC than the osteoporosis self-assessment tool (OST) score (≤ 1) [9]. Low body weight and BMI have been used to predict osteoporosis in women aged 40 to 59 years [10]. LvFF has been shown to be positively correlated with age and negatively correlated with BMD. Nevertheless, to the best of our knowledge, LvFF has not been used to construct such decision rules.

Hybrid scoring systems that integrate clinical information and MRI features for diagnosing specific diseases are rapidly emerging. For example, our prior study showed that the Warthin tumor score outperforms any independent predictor in diagnosing parotid Warthin tumors [11]. We hypothesized that hybrid scores integrating clinical and MRI features would also improve the diagnosis of LvOPI and LvOPO. The aim of our study was to propose an LvOPI score (LvOPIS) and an LvOPO score (LvOPOS) to diagnose female LvOPI and LvOPO, respectively.

Materials and Methods

This prospective study was approved by the Institutional Review Board of Tri-Service General Hospital with written informed consent obtained from each participant.

Patient Cohorts

A total of 101 female patients, including 53 patients before menopause and 48 patients after menopause, were recruited. All participants received BMD measurement using DeXA as well as LvFF quantification. Clinical information, including age, menopause-to-magnetic resonance imaging interval (MMI), body height, body weight, and body mass index (BMI), was recorded.

Measurements of Bone Mineral Density Using DeXA

BMD of the lumbar spine, including the L1 to L4 vertebrae, was measured by DeXA using a Hologic QDR-4500 W (S/N 47,125) model (Hologic Inc., Bedford, MA, USA). BMD data were acquired, processed, and calculated based on the International Society for Clinical Densitometry (ISCD) guidelines [12]. To represent the overall BMD status of each participant, the averaged BMD was calculated by Eq. 1:

$$ BMD_{m} = \frac{1}{4}\sum_{i = 1}^{4} BMD_{i}, $$
(1)

where \(BMD_{m}\) denotes the averaged BMD of the L1 to L4 vertebral bodies and \(BMD_{i}\) denotes the BMD of the ith lumbar vertebral body, with i ranging from 1 to 4.

MRI Protocols

The MRI study was performed using a 1.5 T clinical scanner (Signa HDxt, GE Healthcare, Milwaukee, WI). LvFF was measured by quantitative MRI using the iterative decomposition of water and fat with echo asymmetry and least-squares estimation (IDEAL) method on sagittal T2-weighted images, with imaging parameters including repetition time/echo time (3000 ms/115.1 ms), receiver bandwidth (63.86 kHz), echo train length (20), field of view (240 × 240 mm), slice thickness (10 mm), matrix size (320 × 192), number of excitations (3), and flip angle (90 degrees). A total of 3 slices covered the lumbar spine from the T12 vertebra to the sacrum.

MRI Data Analysis

MR image processing and data analysis were performed with software developed in-house using MATLAB (MathWorks, Natick, MA). A rectangular region of interest (ROI) contouring the vertebral body, without contamination from the cortical bone or subcortical marrow degeneration, was drawn semiautomatically on each of the three sagittal slices for each of the L1 to L4 vertebrae. To eliminate potential sampling bias arising from the intravertebral heterogeneity of signal intensity, encountered especially in postmenopausal women [13], the signal intensity of each vertebral body was averaged. The LvFF was calculated by Eq. 2:

$$ LvFF_{s} = \frac{SI_{s,f}}{SI_{s,f} + SI_{s,w}}, $$
(2)

where \(LvFF_{s}\) denotes the LvFF of the sth lumbar vertebral body, \(SI_{s,f}\) denotes the signal intensity of the fat-only image in the sth lumbar vertebral body, and \(SI_{s,w}\) denotes the signal intensity of the water-only image in the sth lumbar vertebral body. To represent the overall status of the LvFF of each participant, the averaged LvFF was calculated.
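As an illustration of Eq. 2, the following minimal Python sketch computes the per-vertebra fat fraction and its average over L1 to L4. The signal intensities are hypothetical values chosen for the example, and the function names are ours; the sketch also assumes that the averaged LvFF is the mean of the four per-vertebra fractions, which the text does not state explicitly.

```python
from statistics import mean


def fat_fraction(si_fat: float, si_water: float) -> float:
    """Eq. 2: fat-only signal divided by the sum of fat-only and water-only signals."""
    total = si_fat + si_water
    if total <= 0:
        raise ValueError("combined fat and water signal must be positive")
    return si_fat / total


def averaged_lvff(fat_signals, water_signals) -> float:
    """Mean of the per-vertebra fat fractions (assumed averaging scheme)."""
    if len(fat_signals) != len(water_signals) or not fat_signals:
        raise ValueError("need one fat and one water value per vertebra")
    return mean(fat_fraction(f, w) for f, w in zip(fat_signals, water_signals))


# Hypothetical mean ROI signal intensities for L1..L4 (arbitrary units).
fat = [60.0, 70.0, 75.0, 80.0]
water = [40.0, 30.0, 25.0, 20.0]

per_vertebra = [fat_fraction(f, w) for f, w in zip(fat, water)]
lvff_mean = averaged_lvff(fat, water)
print(per_vertebra, lvff_mean)
```

Averaging the fat and water signals first and then taking a single ratio would be an alternative scheme; the two generally give slightly different values.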

Selection of Independent Predictors and Construction of Hybrid Predicting Methods to Diagnose LvOPI and LvOPO in Whole Population, Pre-Menopausal Women and Post-Menopausal Women

First, ROC curves of six potential predictors, including age, body height, body weight, BMI, MMI, and LvFF, in predicting LvOPI and LvOPO in the whole population, LvOPI in premenopausal women, and LvOPO in postmenopausal women were plotted. Second, the best cut-off criterion was determined for each predictor in each of the aforementioned diagnostic tasks. Third, all six predictors were used to construct a hybrid scoring model after linear regression analysis for each of the aforementioned diagnostic tasks and to generate another hybrid predicting model using logistic regression analysis. Fourth, thresholds of AUC > 0.64 and AUC > 0.7 were applied for LvOPI and LvOPO, respectively, to select independent predictors with better diagnostic performance. Finally, three predictors (age, MMI, and LvFF) were selected for hybrid models in diagnosing the LvOPI and the LvOPO in the whole population; four predictors (age, LvFF, body weight, and BMI) were selected for hybrid models in diagnosing the LvOPI in premenopausal women; and two predictors (body height and body weight) were selected for hybrid models in diagnosing the LvOPO in postmenopausal women.
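The first two steps can be sketched in Python as follows. This is a minimal illustration rather than the software used in the study: the AUC is computed as the proportion of positive-negative pairs ranked correctly, and the best cut-off is assumed to be the one maximizing Youden's index (sensitivity + specificity − 1), a common choice that the text does not confirm. The data are hypothetical, and higher predictor values are assumed to indicate disease; for predictors such as body weight, the values would be negated first.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index.
    A case is called positive when its score is >= cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:  # keep the lowest cutoff on ties
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical LvFF values; label 1 = low BMD, 0 = normal BMD.
lvff = [0.55, 0.60, 0.62, 0.70, 0.66, 0.72, 0.75, 0.80]
low_bmd = [0, 0, 0, 0, 1, 1, 1, 1]

auc = roc_auc(lvff, low_bmd)
cutoff, sensitivity, specificity = best_cutoff(lvff, low_bmd)
print(auc, cutoff, sensitivity, specificity)
```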

Statistical Analysis

Statistical analyses were performed using MATLAB, SPSS Version 16.0 software (SPSS Inc, Chicago, IL), SAS 9.4 (SAS Institute Inc., Cary, NC), and MedCalc Version 13.0 (MedCalc Software Inc, Ostend, Belgium). The relationships between two continuous parameters were evaluated by linear regression analyses. Comparisons between two groups were performed by the Mann–Whitney test. Comparisons among the three groups classified based on the BMD were performed by the nonparametric Kruskal–Wallis test, with post hoc analysis corrected for multiple comparisons. Nonparametric receiver operating characteristic (ROC) curves were used to distinguish the LvOPO group from the non-LvOPO groups and to distinguish the normal group from the abnormal groups. Comparison of areas under multiple ROC curves was performed by a nonparametric test using the DeLong method [14]. A P value less than 0.05 was considered statistically significant.

Results

Patient characteristics classified by the menopausal statuses are shown in Table 1.

Table 1 Patient characteristics classified by the menopausal statuses

Scatter Plots and Linear Regression Analyses Between the BMD and Other Parameters

Scatter plots of different clinical and MRI parameters vs. BMD in the whole population (Fig. 1), premenopausal group (Fig. 2), and postmenopausal group (Fig. 3) are shown. Linear regression analyses showed that, in the whole population, the BMD was significantly negatively associated with age, LvFF, and MMI (all P < 0.005) and significantly positively associated with body height and body weight (all P < 0.05). In the premenopausal group, the BMD was significantly negatively associated with the LvFF (P < 0.05), negatively associated with age with marginal significance (P = 0.05), and significantly positively associated with body weight (P < 0.001) and BMI (P < 0.01). In the postmenopausal group, the BMD was significantly positively associated with age and body height (both P < 0.05).

Fig. 1 Scatter plots of clinical and MRI parameters, including (a) age, (b) LvFF, (c) MMI, (d) body height, (e) body weight, and (f) BMI, vs. lumbar vertebral BMD in the whole population, showing that the BMD is negatively associated with age, LvFF, and MMI and positively associated with body height and body weight

Fig. 2 Scatter plots of clinical and MRI parameters, including (a) age, (b) LvFF, (c) body height, (d) body weight, and (e) BMI, vs. lumbar vertebral BMD in the premenopausal group, showing that the BMD is negatively associated with LvFF and positively associated with body weight and BMI

Fig. 3 Scatter plots of clinical and MRI parameters, including (a) age, (b) LvFF, (c) MMI, (d) body height, (e) body weight, and (f) BMI, vs. lumbar vertebral BMD in the postmenopausal group, showing that the BMD is positively associated with age and body height

Comparisons of Parameters Among the LvOPO, LvOPI, and Normal Groups

Box-and-whisker plots of the patient characteristics classified by the BMD are shown in Fig. 4. The LvOPO group was significantly older than the LvOPI group (P < 0.01) and the normal group (P < 0.001), had significantly longer MMI than the LvOPI group (P < 0.01) and the normal group (P < 0.001), was significantly shorter than the normal group (P < 0.05), and had significantly higher LvFF than the LvOPI group (P < 0.05) and the normal group (P < 0.005). The LvOPI group was significantly older (P < 0.05) and had significantly longer MMI (P < 0.05) as well as higher LvFF (P < 0.001) than the normal group.

Fig. 4 Box-and-whisker plots of (a) age, (b) LvFF, (c) MMI, (d) body height, (e) body weight, and (f) BMI in different patient groups classified by the lumbar vertebral BMD, showing significant differences in age, LvFF, and MMI among the normal, LvOPI, and LvOPO groups and a significant difference in body height between the normal and LvOPO groups. Note: *, **, and *** denote P values less than 0.05, 0.01, and 0.005, respectively

ROC Curves of Independent Predictors and Hybrid Scores in Predicting LvOPI and LvOPO in Whole Population

ROC curves of the six independent predictors and two hybrid predicting models, including hybrid scores and logistic regression models, in diagnosing the LvOPI and the LvOPO in the whole population are plotted in Fig. 5. The AUCs of age, LvFF, MMI, body height, body weight, and BMI in diagnosing the LvOPI were 0.691, 0.758, 0.67, 0.618, 0.576, and 0.536, respectively. The AUCs of the LvOPIS and the logistic regression model using all six predictors were 0.746 and 0.804, respectively. The AUCs of age, LvFF, MMI, body height, body weight, and BMI in diagnosing the LvOPO were 0.794, 0.802, 0.788, 0.687, 0.617, and 0.545, respectively. The AUCs of the LvOPOS and the logistic regression model using all six predictors were 0.833 and 0.85, respectively.

Fig. 5 ROC curves of the clinical and MRI predictors in diagnosing the LvOPI and the LvOPO in the whole female population

ROC Curves of Independent Predictors and Hybrid Scores in Predicting LvOPI in Pre-Menopausal Women and Predicting LvOPO in Post-Menopausal Women

ROC curves of the five independent predictors and two hybrid predicting models, including hybrid scores and logistic regression models, in diagnosing the LvOPI in pre-menopausal women are plotted in Fig. 6 (left). The AUCs of age, LvFF, body height, body weight, and BMI in diagnosing the LvOPI were 0.649, 0.788, 0.5, 0.747, and 0.7737, respectively. The AUCs of the LvOPIS and the logistic regression model using five predictors were 0.868 and 0.898, respectively. ROC curves of the six independent predictors and two hybrid predicting models, including hybrid scores and logistic regression models, in diagnosing the LvOPO in post-menopausal women are plotted in Fig. 6 (right). The AUCs of age, LvFF, MMI, body height, body weight, and BMI in diagnosing the LvOPO were 0.546, 0.607, 0.551, 0.707, 0.708, and 0.665, respectively. The AUCs of the LvOPOS and the logistic regression model using six predictors were 0.713 and 0.767, respectively.

Fig. 6 ROC curves of the clinical and MRI predictors in diagnosing the LvOPI in the premenopausal group and the LvOPO in the postmenopausal group

ROC Curves of Hybrid Scores Using Selected Predictors Based on AUC Criteria

Table 2 shows the sensitivity and specificity of each single predictor selected for the LvOPIS and LvOPOS in diagnosing LvOPI and LvOPO in the whole population, pre-menopausal group, and post-menopausal group. For the whole population, age, MMI, and LvFF qualified for the LvOPIS and the LvOPOS. For the premenopausal group, age, LvFF, body weight, and BMI qualified for the LvOPIS. For the postmenopausal group, body height and body weight qualified for the LvOPOS. ROC curves of the LvOPIS, the LvOPOS, and the logistic regression models in diagnosing LvOPI and LvOPO in the whole population with three selected predictors are plotted in Fig. 5A and Fig. 5B, respectively. ROC curves of the LvOPIS, the LvOPOS, and the logistic regression models in diagnosing LvOPI in pre-menopausal women using four selected predictors and LvOPO in post-menopausal women using two selected predictors are plotted in Fig. 6A and Fig. 6B, respectively.

Table 2 Criteria for predictors selected to comprise the LvOPI score and the LvOPO score
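The selection step can be checked directly against the single-predictor AUCs reported in the two preceding subsections. The short Python sketch below applies the stated thresholds (AUC > 0.64 for LvOPI, AUC > 0.7 for LvOPO) to those reported values; the dictionary layout and function name are ours and serve only as illustration.

```python
# Single-predictor AUCs as reported in the Results text.
REPORTED_AUC = {
    ("whole", "LvOPI"): {
        "age": 0.691, "LvFF": 0.758, "MMI": 0.67,
        "body height": 0.618, "body weight": 0.576, "BMI": 0.536,
    },
    ("whole", "LvOPO"): {
        "age": 0.794, "LvFF": 0.802, "MMI": 0.788,
        "body height": 0.687, "body weight": 0.617, "BMI": 0.545,
    },
    ("premenopausal", "LvOPI"): {
        "age": 0.649, "LvFF": 0.788, "body height": 0.5,
        "body weight": 0.747, "BMI": 0.7737,
    },
    ("postmenopausal", "LvOPO"): {
        "age": 0.546, "LvFF": 0.607, "MMI": 0.551,
        "body height": 0.707, "body weight": 0.708, "BMI": 0.665,
    },
}

# Thresholds stated in the Methods: AUC > 0.64 for LvOPI, AUC > 0.7 for LvOPO.
AUC_THRESHOLD = {"LvOPI": 0.64, "LvOPO": 0.70}


def select_predictors(aucs, threshold):
    """Keep predictors whose AUC strictly exceeds the threshold."""
    return {name for name, auc in aucs.items() if auc > threshold}


selected = {
    key: select_predictors(aucs, AUC_THRESHOLD[key[1]])
    for key, aucs in REPORTED_AUC.items()
}

for (group, task), names in selected.items():
    print(group, task, sorted(names))
```

Applying the thresholds in this way reproduces the predictor sets listed above for each group.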

Analysis of Reliability of Single Predictor and Hybrid Predicting Models

Table 3 shows the sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and F1 score for each single predictor and hybrid predicting method in diagnosing LvOPI and LvOPO in the whole population, pre-menopausal group, and post-menopausal group.

Table 3 Diagnostic performance of single predictors and hybrid models for LvOPI and LvOPO
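For reference, the six measures listed for Table 3 follow from the counts of a 2 × 2 confusion matrix using their standard definitions. The Python sketch below shows the computation; the counts are hypothetical and are not taken from the study data.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic measures from confusion-matrix counts."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("each row and column of the 2 x 2 table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),           # true positive rate
        "specificity": tn / (tn + fp),           # true negative rate
        "ppv": tp / (tp + fp),                   # positive predictive value
        "npv": tn / (tn + fn),                   # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),       # harmonic mean of PPV and sensitivity
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```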

Comparisons of AUC Among Single Predictors and Hybrid Predicting Models

In the whole population, the LvOPI_Logi6 model achieved the highest AUC in diagnosing LvOPI, followed by the LvOPIS6 model, LvOPI_Logi3 model, LvFF, and LvOPIS3 model in decreasing order (Fig. 5). For the diagnosis of LvOPO, the LvOPO_Logi6 model achieved the highest AUC, followed by the LvOPOS6 model, LvOPOS3 model, LvOPO_Logi3 model, and LvFF in decreasing order (Fig. 5). Nevertheless, there was no significant difference between any hybrid predicting model and the best single predictor, LvFF, in diagnosing either LvOPI or LvOPO (P = 0.1322 to 0.7151).

In the premenopausal group, the LvOPI_Logi4 model achieved the highest AUC in diagnosing LvOPI, followed by the LvOPI_Logi5 model, LvOPIS5 model, LvOPIS4 model, and LvFF in decreasing order (Fig. 6). Both hybrid models using logistic regression analysis significantly outperformed the best single predictor, LvFF (P < 0.05). The LvOPISs significantly outperformed BMI, age, and body height (all P < 0.05), but not LvFF (P = 0.2736) or body weight (P = 0.0566), in diagnosing LvOPI.

In the postmenopausal group, the LvOPO_Logi6 model achieved the highest AUC, followed by the LvOPO_Logi2 model, LvOPOS6 model, LvFF, and LvOPOS2 model in decreasing order, but without significant difference between the hybrid models and the best single predictor (P = 0.14 to 0.9323) (Fig. 6).

Discussion

Our study showed significantly higher LvFF in the postmenopausal women than in the premenopausal women, by a difference of 17% on average, consistent with studies performed by Burian et al. (21%) [13] and Sollmann et al. (18%) [15]. This difference can be attributed to the compound effects of aging. The postmenopausal women in our study were older than the premenopausal women by an average of 15.5 years. Our observation is consistent with Schellinger’s study [16] and Baum’s study [17], which showed an increase of LvFF with increasing age. On the other hand, estrogen has been found to have an inverse effect on the vertebral marrow fat fraction: the fat fraction increases during the follicular phase and decreases during the luteal phase of the menstrual cycle, and it decreases rapidly in postmenopausal women after 17-β estradiol administration [18].

Our study showed significantly lower BMD in the postmenopausal women than in the premenopausal women, by a difference of −1.46 on the T-score. The difference in BMD can be attributed to aging, which contributes to bone loss by suppressing osteogenic programs in the bone marrow [19,20,21]. It can also be ascribed to the depletion of estrogen in post-menopausal women, because estrogen reduces bone resorption by inhibiting osteoclast formation and enhancing osteoclast apoptosis [22, 23]. Our study showed a higher proportion (64.5%) of LvOPO plus LvOPI in the postmenopausal women than in the premenopausal women (34.5%). Our results can be explained by the mixed effects of accelerated bone resorption related to estrogen deficiency and decelerated bone formation related to aging [24, 25].

Our study showed a general trend of an inverse association between the BMD and LvFF as well as an inverse association between the BMD and age in the women. Our results could be explained by the aging factor. A recent review focusing on cell autonomous changes in hematopoietic and skeletal systems showed reduced osteogenesis and increased adipogenesis in aging bone [26]. In addition to the aging factor, marrow adipose tissue itself has been recognized as an active endocrine organ [27]. Bone marrow adipocytes have been reported to be inversely associated with the BMD [28]. Verma et al. provided histological evidence by examining the adipocytic to haemopoietic tissue ratio in iliac crest biopsies [29]. Their results indicated increased volume of adipose tissue and reduced bone formation in osteoporosis. Yeung et al. examined the LvFF and fat unsaturation index using MR spectroscopy, showing that osteoporosis is associated with increased marrow fat and, especially, saturated lipids rather than unsaturated lipids [30].

Clinical information, including but not limited to age, body weight, BMI, and post-menopausal years, has been used to construct clinical decision rules such as SCORE, ORAI, OSA, OSIRIS, and ABONE to predict osteoporosis [8]. Our results showed that age, MMI, and LvFF allowed prediction of LvOPO with AUCs ranging from 0.788 to 0.802 in the whole female population. By integrating these predictors, the hybrid predicting models achieved AUCs ranging from 0.81 to 0.85, higher than those of the single predictors.

By excluding the postmenopausal women, the influence of estrogen depletion due to ovarian failure on the LvOPI can theoretically be omitted in the premenopausal women. Prior studies show that BMD is inversely associated with age [31] and LvFF [32] but positively associated with BMI [33] in premenopausal women. Our study showed that age, body weight, BMI, and LvFF allowed prediction of premenopausal LvOPI with AUCs ranging from 0.649 to 0.788. LvFF has been used to predict LvOPO in a general population containing both women and men, achieving an AUC of 0.896 [34]. By integrating these predictors, the LvOPIS achieved an AUC significantly higher than those of age and BMI, and higher than that of body weight with marginal significance. With an AUC similar to that of the LvOPIS, a hybrid predicting model using logistic regression outperformed all single predictors, including LvFF, with statistical significance.

In the postmenopausal women, however, age, LvFF, and MMI did not predict LvOPO as well as they did in the whole female population, nor did the LvOPOS. The poor diagnostic performance of these predictors and the LvOPOS might be attributed to a potential drawback of DeXA, which has been reported to overestimate the BMD by mistaking osteophytes for lumbar vertebral bony structures [2, 35].

Our study has several limitations. First, the potential osteophyte-related overestimation of BMD using DeXA was not evaluated in our study. We suggest that our observations in postmenopausal women not be overemphasized. Second, among the premenopausal women, our study only enrolled patients older than 40 years of age. Our results should not be generalized to those younger than 40 years of age. To better understand the performance of the aforementioned clinical and MR features in diagnosing LvOPI or LvOPO, further study enrolling a wider age range is warranted. Third, our study only enrolled Taiwanese patients from a single hospital. A multi-center study enrolling patients of different ethnic backgrounds is warranted to evaluate the generalizability of our results. Finally, despite the similar age distribution in the postmenopausal women, the LvFF in our study (0.74 ± 0.05; 63 ± 7 years) was apparently higher than that in Burian’s study (0.49 ± 0.08; 65 ± 7 years) [13] and Sollmann’s study (0.47 ± 0.09; 63 ± 6 years) [15]. The discrepancy in postmenopausal female LvFF might be attributed to either population variation or sampling bias. In 2019, Burian et al. documented that the heterogeneity of the lumbar vertebral bone marrow was greater in postmenopausal women than in premenopausal women [13]. To eliminate the potential sampling bias related to the lumbar vertebral bone marrow heterogeneity, the signal intensity averaged from the ROIs contouring the entire vertebra on three sagittal slices was intentionally used for calculation of LvFF in our study.

In conclusion, by integrating the LvFF derived from MR images and the clinical data, our proposed hybrid predicting models improve the diagnostic accuracy of the LvOPI in premenopausal Taiwanese women. Our study suggests that premenopausal LvOPI could be accurately diagnosed incidentally when patients undergo an MR study, so that they might have a better chance of preventing further progression toward LvOPO.