Introduction

Vitamin D supplements are widely used, either self-administered or on prescription [1]. This widespread use is not surprising, as most international and national guidelines advise vitamin D supplementation for groups at risk of vitamin D insufficiency. These guidelines are based on the potentially harmful effects of low vitamin D status [2,3,4]. Traditionally, vitamin D deficiency is associated with rickets and osteomalacia. Nowadays, these conditions are rare in daily practice. Many vitamin D-related association studies and intervention trials have been conducted in recent decades. Observational studies have shown that vitamin D status is inversely correlated with multiple diseases and conditions such as fractures, falls, cardiovascular disease, different forms of cancer and psychiatric disorders. However, intervention trials have so far only shown positive effects of vitamin D supplementation on bone-related problems and not on other outcomes [5]. Supplementation of vitamin D in combination with calcium can reduce loss of bone mineral density and reduce fracture risk [6,7,8,9,10]. The reduction of fractures by supplementation of both calcium and vitamin D is particularly clear in the oldest, frail persons [11]. Vitamin D with or without calcium supplementation might also reduce falls, but meta-analyses have shown contradictory results [12,13,14,15,16,17].

The serum 25-hydroxyvitamin D (25(OH)D) concentration is the best marker of vitamin D status. The threshold below which vitamin D supplementation is needed is still under debate. While some advise treatment at 25(OH)D levels of < 25–40 nmol/L, others claim that the threshold should be < 75 nmol/L [18, 19]. The Health Council of the Netherlands considers a level > 50 nmol/L sufficient for women older than 50 years and men older than 70 years and advises daily vitamin D supplementation for this entire group [4]. The Institute of Medicine (IOM) has proposed the same threshold. The IOM presumes that the average requirement of the population is covered with serum 25(OH)D levels of 40 nmol/L [2].

When using 50 nmol/L as the cut-off, about 50% of older Dutch persons have vitamin D insufficiency [20, 21]. Because only half of all older persons have serum 25(OH)D levels below the recommended threshold, general advice to supplement vitamin D is useful for only half of the older population. In the other half, treatment would not be necessary, depending on the season of blood collection. When lower thresholds are used, the extent of overtreatment would be even greater. The consequences are medicalization and costs [22].

Testing of serum 25(OH)D levels can reduce the number of unnecessary treatments but is also expensive, and the outcome is influenced by the period of the year. Three models to predict vitamin D deficiency in older persons have been published previously [23,24,25]. In an analogous manner, we wanted to develop a prediction model for insufficient vitamin D status in older women. In contrast to previous studies, our study focuses on the clinical implementation of the model. First, we focused on older women with risk factors for fractures and evaluated clinical applicability. Second, we used a spectrum of thresholds, so that medical professionals can choose the threshold they consider appropriate.

The goal of this study was to develop a validated prediction model that could discriminate between older women who do and do not need treatment with vitamin D supplements. We aimed to construct a simple tool that can be added to fracture risk evaluation in general practice or that can be used as a self-test. Because there is still discussion about the optimal cut-off value of 25(OH)D levels, we developed the model with different thresholds for serum 25(OH)D levels, namely 30, 40, 50 and 60 nmol/L.

Methods

For the development of our model, we followed the instructions of the TRIPOD statement [26]. The steps of our method are presented schematically in Fig. 1.

Fig. 1 Scheme of the development and validation of the prediction model

Study population

Participants were selected from a randomised prospective trial on the detection of older women at risk for fractures in the general population and their subsequent treatment with anti-osteoporotic medication, the SALT Osteoporosis Study (SOS) [21]. In this study, women aged between 65 and 90 years were enrolled from general practices surrounding five laboratory locations in Noord-Holland, a province in The Netherlands (Northern Latitude 52°). Women with at least one risk factor for fractures (fracture after the age of 50 years, parental hip fracture, low body weight, immobility and conditions that may cause secondary osteoporosis) who had been randomised to the intervention group of the study were evaluated with a questionnaire, bone densitometry and laboratory examination [27]. Informed consent was obtained from all individual participants and the study was approved by the Dutch Health Council (OGZ 2.978.265).

For the development of the prediction model, we used the baseline data from all women in the intervention group collected in the period from March 2010 until February 2013. An additional vitamin D-related questionnaire was filled out by the participants that were included during this period. The women who had their examination in Amsterdam were excluded because in this laboratory, a different 25-hydroxyvitamin D assay was used. Furthermore, we excluded Black, Arabic, Turkish and Asian women (n = 103) and women in a residential care home (n = 24), resulting in a total of 2689 participants. The reason for excluding these women was that there were too few participants in these groups and most of them already had an indication for vitamin D supplementation.

For the external validation of the prediction model, we collected data from a second population in the same region. All women between 65 and 90 years old who had been referred by a general practitioner for combined dual X-ray absorptiometry (DXA) and laboratory examination from September 2010 until February 2013 were asked to participate and to complete the same questionnaires (n = 856). This was a non-overlapping population in the same region that was evaluated in the same laboratory.

Questionnaires

All women completed two questionnaires. The first questionnaire contained questions about risk factors for fractures and osteoporosis. If a questionnaire was returned incomplete, the participant was contacted by telephone to complete the missing answers. The second questionnaire contained questions about predictors of vitamin D status. The questionnaires were collected at the laboratory visit when DXA and blood tests were performed.

Laboratory analyses

Serum 25(OH)D was analysed with a chemiluminescence assay (Diasorin, Stillwater, MN, USA) as a routine measurement at the Centrum voor Medische Analyse Antwerpen. The laboratory is accredited and meets the European norm ISO 15189. The inter-assay coefficient of variation was 10.1% at a 25(OH)D concentration of 61.5 nmol/L and 9.9% at a 25(OH)D concentration of 36 nmol/L.

Outcome

In the models, we used a dichotomous outcome: a 25(OH)D level below the threshold versus a level equal to or above the threshold, with thresholds of 30, 40, 50 and 60 nmol/L.
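
The outcome coding can be illustrated with a minimal Python sketch. The function name and the example value are hypothetical and serve only as an illustration; they are not part of the study software.

```python
# Thresholds (nmol/L) used for the four models.
THRESHOLDS_NMOL_L = (30, 40, 50, 60)


def dichotomise(serum_25ohd_nmol_l):
    """Return, per threshold, True when the level is below that threshold.

    A level equal to the threshold counts as 'equal to or above', i.e. False.
    """
    return {t: serum_25ohd_nmol_l < t for t in THRESHOLDS_NMOL_L}


# Hypothetical participant with a serum 25(OH)D level of 45 nmol/L.
print(dichotomise(45.0))
```

With this coding, one participant contributes one binary outcome to each of the four models.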

Predictors of vitamin D status

The selected potential predictors were age and body mass index (BMI), period of the year of blood sampling, immobility, use of walking aid, falling during previous 12 months, vitamin D supplementation either self-administered or prescribed, multivitamin supplementation, calcium supplementation, smoking, consumption of fatty fish, consumption of margarine, alcohol use, time spending outside in winter and summer, the level of education and polypharmacy.

Age was calculated as date of examination minus date of birth. Weight and height were measured during the visit of the laboratory and BMI was calculated as kg/m2.
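
As an illustration, both derived variables can be computed as follows. This is a sketch: the function names are hypothetical, and converting days to years by dividing by 365.25 is an assumption, since the text does not state how the date difference was expressed.

```python
from datetime import date


def age_in_years(birth, examination):
    """Age as date of examination minus date of birth, expressed in years.

    Assumption: days are converted to years with a divisor of 365.25.
    """
    return (examination - birth).days / 365.25


def bmi(weight_kg, height_m):
    """Body mass index in kg/m2."""
    return weight_kg / height_m ** 2


# Hypothetical participant, not study data.
print(round(age_in_years(date(1940, 1, 1), date(2010, 1, 1)), 1))
print(round(bmi(70.0, 1.75), 1))
```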

All other questions were self-reported. Patients were considered to be immobile when severe problems with walking were indicated (yes/no). Use of walking aid, current smoking, falls (at least 1 in the last 12 months), supplementation of vitamin D, calcium and multivitamin use (daily) were dichotomous questions (yes/no). In a second question, the participants were asked to specify whether the supplementation of vitamin D was on prescription or self-administered.

Time spent outdoors in winter and summer, level of education, alcohol use, fatty fish and margarine use were measured in four categories. For the model, these categories were dichotomized.

Medication use was collected by asking participants to list all current medication. We defined polypharmacy as the use of six drugs or more. We used the individual dates of blood sampling for the serum 25(OH)D measurement to determine the period of the year.

Model development

The development of the prediction model was preceded by the preparation of some potential predictors. Before modelling, we evaluated collinearity of the variables. There was none. We used splines to evaluate non-linear relationships between continuous predictors and the outcome and to convert continuous variables into categories. In order to reduce the number of variables for the period of the year, we observed the pattern of seasonal change of 25(OH)D levels over the months during 3 years. We used univariate linear regression of 25(OH)D levels to find an optimal reflection of the seasonal influence in a maximum of four periods and selected the model with the highest R2 and the fewest periods.
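
The comparison of candidate period groupings can be sketched in Python. For a linear regression on a categorical period label, the fitted values are the group means, so R2 equals one minus the within-group sum of squares divided by the total sum of squares. The monthly values and both groupings below are hypothetical and only illustrate the comparison; they are not study data.

```python
from statistics import mean


def r_squared(levels, labels):
    """R2 of a linear regression of levels on a categorical period label."""
    grand_mean = mean(levels)
    ss_total = sum((y - grand_mean) ** 2 for y in levels)
    groups = {}
    for y, label in zip(levels, labels):
        groups.setdefault(label, []).append(y)
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    return 1 - ss_within / ss_total


# Hypothetical 25(OH)D levels (nmol/L), one value per month, January to December.
months = list(range(1, 13))
levels = [42, 41, 43, 44, 52, 56, 63, 62, 57, 54, 51, 45]

# Two hypothetical candidate groupings of the months into periods.
four_seasons = {12: "winter", 1: "winter", 2: "winter",
                3: "spring", 4: "spring", 5: "spring",
                6: "summer", 7: "summer", 8: "summer",
                9: "autumn", 10: "autumn", 11: "autumn"}
two_halves = {m: ("first" if m <= 6 else "second") for m in months}

for name, grouping in (("four seasons", four_seasons), ("two halves", two_halves)):
    labels = [grouping[m] for m in months]
    print(name, len(set(labels)), "periods, R2 =", round(r_squared(levels, labels), 3))
```

The trade-off between a higher R2 and fewer periods is then judged from the printed values.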

To select the predictors for the model, we performed logistic regression analyses. We used a backward selection strategy. That means that we started with all potential predictors in the model and, at each step, excluded the predictor with the highest p value, until all p values were lower than our selection criterion of p < 0.157 (Akaike’s information criterion). A stricter p value leads to the development of models that are closely adapted to the data and that generalise poorly.
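
The link between the criterion of 0.157 and Akaike's information criterion can be checked numerically: for a predictor with one degree of freedom, the criterion favours inclusion when the chi-square statistic exceeds 2, and the upper-tail probability of a chi-square variable with one degree of freedom at 2 is about 0.157. The sketch below also outlines the elimination loop. The callable `fit_p_values` is a hypothetical placeholder for refitting the logistic model, and the p values in the example are invented and held fixed, whereas in a real analysis they change after every refit.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def backward_select(predictors, fit_p_values, alpha=0.157):
    """Drop the predictor with the highest p value until all are below alpha.

    fit_p_values is a hypothetical callable that refits the model on the
    given predictor names and returns a dict {name: p value}.
    """
    kept = list(predictors)
    while kept:
        p_values = fit_p_values(kept)
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break
        kept.remove(worst)
    return kept


print(round(chi2_sf_1df(2.0), 3))  # 0.157

# Toy stand-in for model refitting, with invented and fixed p values.
toy_p = {"age": 0.001, "bmi": 0.03, "education": 0.40, "alcohol": 0.20}
print(backward_select(toy_p, lambda names: {n: toy_p[n] for n in names}))
```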

Since missing data can affect a prognostic model, we planned to use multiple imputation to estimate the missing values according to the Multivariate Imputation by Chained Equation procedure in IBM SPSS Statistics 20, but only if more than 5% of cases had missing data. Because there were few missing cases, imputation was not necessary.

Model performance

To evaluate the discrimination of the model, i.e., whether the model is able to distinguish patients with and without an insufficient 25(OH)D status, a receiver operating characteristic (ROC) curve was made and the area under the ROC curve (AUC) was calculated. The goodness-of-fit of the model was tested by the Hosmer-Lemeshow test. The explained variance was indicated by Nagelkerke’s R2.
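
The AUC can be read as the probability that a randomly chosen woman with an insufficient status receives a higher predicted risk than a randomly chosen woman without it, with ties counting as one half. A minimal Python sketch of this pairwise definition is given below; the function name and example values are hypothetical.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    scores: predicted risks or risk scores; outcomes: True for insufficient status.
    """
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    if not cases or not controls:
        raise ValueError("both outcome classes are needed to compute the AUC")
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical example: perfect separation gives an AUC of 1.0.
print(auc([1, 2, 3, 4], [False, False, True, True]))
```

The pairwise form is quadratic in the number of participants, which is acceptable for illustration but slower than rank-based implementations.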

Internal validation was performed with bootstrapping techniques. Regression coefficients and performance of the models were adjusted according to the optimism estimates from the internal validation procedure.
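
A common way to obtain such optimism estimates is to refit the model in each bootstrap sample and compare its performance in that sample with its performance in the original data. The sketch below shows this idea for the AUC only; it does not shrink regression coefficients. It assumes data rows of the form (predictor, outcome) and uses a deliberately trivial toy model with simulated data. All names and values are hypothetical, and the procedure used in the study may differ in detail.

```python
import random


def auc(scores, outcomes):
    """Pairwise AUC; ties count as one half."""
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def optimism_corrected(rows, fit, performance, n_boot=200, seed=1):
    """Return (apparent, optimism-corrected) performance.

    Optimism is the mean over bootstrap samples of: performance of the
    bootstrap-fitted model in its own sample minus its performance in
    the original data.
    """
    rng = random.Random(seed)
    apparent = performance(fit(rows), rows)
    total_optimism = 0.0
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))
        while len({row[1] for row in sample}) < 2:  # AUC needs both classes
            sample = rng.choices(rows, k=len(rows))
        model = fit(sample)
        total_optimism += performance(model, sample) - performance(model, rows)
    return apparent, apparent - total_optimism / n_boot


def fit_direction(rows):
    """Toy model: only learns whether higher x means higher risk (+1) or lower (-1)."""
    return 1.0 if auc([x for x, _ in rows], [y for _, y in rows]) >= 0.5 else -1.0


def auc_of(model, rows):
    return auc([model * x for x, _ in rows], [y for _, y in rows])


# Simulated data, not study data: cases have a higher mean predictor value.
demo_rng = random.Random(0)
rows = [(demo_rng.gauss(1.0 if y else 0.0, 1.0), y) for y in [True, False] * 50]
apparent, corrected = optimism_corrected(rows, fit_direction, auc_of)
print(round(apparent, 3), round(corrected, 3))
```

With a model this simple the optimism is close to zero; models with many selected predictors typically show a larger difference between apparent and corrected performance.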

We constructed a risk score by multiplying the regression coefficients of the predictors by 10 and dividing by 3. The result was rounded to the nearest whole number. The division by 3 was made to keep the scores low and the model practical. To check the loss of information, we compared the AUC before and after dividing by 3. We calculated the sensitivity, specificity, positive predictive value and negative predictive value for the defined thresholds.
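
The conversion from regression coefficients to score points can be sketched as follows. The coefficients are hypothetical, not the values from Table 1, and the sketch assumes that every predictor enters the model as a yes/no indicator.

```python
def risk_points(beta):
    """Points for one predictor: coefficient times 10, divided by 3, rounded."""
    return round(beta * 10 / 3)


def risk_score(betas, patient):
    """Sum the points of all predictors that are present in the patient."""
    return sum(risk_points(betas[name]) for name, present in patient.items() if present)


# Hypothetical coefficients for illustration only.
betas = {"walking_aid": 0.50, "smoking": 0.62, "vitamin_d_prescribed": -1.80}
patient = {"walking_aid": True, "smoking": False, "vitamin_d_prescribed": True}

print({name: risk_points(b) for name, b in betas.items()})
print(risk_score(betas, patient))
```

Note that Python's `round` resolves exact halves to the nearest even integer, which may differ from other rounding conventions in rare tie cases.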

For the external validation, we calculated the AUC of models and the practical model based on the risk score in a second population.

Examples of application

We show two examples of how the prediction model can be used for a practical purpose. First, we produce figures with predictive values for the different thresholds. These might be useful for predicting the probability of deficiency in daily practice. Second, we apply this tool to estimate the proportion of women at risk for 25(OH)D insufficiency who need supplementation in a specific period of the year.

Software

IBM SPSS Statistics version 22 was used for data checking and modelling and R 3.2.2 was used for bootstrapping and assessment of the final models.

Results

Population characteristics

Between March 2010 and February 2013, we collected questionnaires and serum 25(OH)D values of 2689 women. Of these women, 2624 (97.6%) had complete data; hence, imputation was not necessary. Serum 25(OH)D levels ≥ 60, ≥ 50, ≥ 40 and ≥ 30 nmol/L were found in 32, 50, 65 and 90% of the women, respectively. The population for external validation comprised 856 women. The main characteristics of both populations are shown in Table 1.

Table 1 The prevalence of the determinant in the study population, the regression coefficients of the models with different cut-offs and corresponding risk scores

Modelling of seasonal changes

Figure 2 shows the seasonal change of the mean serum 25(OH)D levels. The model with the highest R2 and the lowest number of periods was a model with three periods: a nadir from December until April, a peak in July and August and intermediate levels in the remaining months.
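
The resulting three-period variable can be derived from the month of blood sampling, as in the following Python sketch. The function name and the period labels are hypothetical.

```python
def period_of_year(month):
    """Map the month of blood sampling (1 to 12) to one of three periods."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if month in (12, 1, 2, 3, 4):
        return "nadir"          # December until April
    if month in (7, 8):
        return "peak"           # July and August
    return "intermediate"       # May, June and September until November


print([period_of_year(m) for m in range(1, 13)])
```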

Fig. 2 Mean serum 25(OH)D levels per month. Error bars show 95% CI of the mean

Development of the prediction models

Irrespective of the chosen threshold, the best prediction model contained the following predictors: age, BMI, walking aid, vitamin D supplementation either self-administered or prescribed, multivitamin use, calcium supplementation, smoking, time spent outdoors in summer and period of blood sampling. At thresholds of 30 and 40 nmol/L, use of margarine and fatty fish were predictors. From a threshold of 50 nmol/L, use of fatty fish was not a predictor, and use of margarine was a very weak predictor that disappeared from the model at higher thresholds. Falling, on the other hand, was a predictor at the higher thresholds but not at 30 nmol/L. Time spent outdoors in winter was a predictor at thresholds of 40 and 50 nmol/L. Table 1 shows the β’s of the different predictors at the different thresholds.

Validation of the prediction model

The internally validated models showed an AUC of 0.72 to 0.77 with the highest for the threshold 30 nmol/L. There is little loss of information when the model is converted to a risk score. The R2 is 0.25 for the threshold of 40 nmol/L. The AUC of the external validation was between 0.71 and 0.82 (Table 2). Figure 3 shows the sensitivity, specificity, positive predictive value and negative predictive value of the model. The risk scores for the different models are shown in Table 1.

Table 2 AUC of the ROC curve and Nagelkerke R square of the internally validated models and the simplified models for the different 25(OH)D thresholds

Fig. 3 Four models with different thresholds of serum 25-hydroxyvitamin D. The Y-axis shows the positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity and prevalence for any computed risk score (X-axis) of participants in the primary population

Application of the risk score

To use the model, one has to choose a threshold. Figure 3 shows the predictive value for the different thresholds. For example, when a threshold of 50 nmol/L is desired and a patient has a risk score of 0 or lower, this patient has a 70% probability to have 25(OH)D levels above the threshold.
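
The quantities plotted in Fig. 3 can be computed for any cut-off on the risk score, as sketched below in Python. The sketch assumes that a score above the cut-off counts as a positive test (predicted insufficiency), so that the negative predictive value is the probability of a level at or above the threshold given a score at or below the cut-off. The scores and outcomes are hypothetical, not study data.

```python
def diagnostic_values(scores, insufficient, cutoff):
    """Sensitivity, specificity, PPV and NPV for a given risk score cut-off.

    Assumption: a score above the cut-off is a positive test.
    """
    tp = fp = tn = fn = 0
    for score, is_insufficient in zip(scores, insufficient):
        if score > cutoff:
            if is_insufficient:
                tp += 1
            else:
                fp += 1
        else:
            if is_insufficient:
                fn += 1
            else:
                tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical risk scores and outcomes (True = level below the chosen threshold).
scores = [-4, -2, 0, 1, 3, 5]
insufficient = [False, False, True, False, True, True]
print(diagnostic_values(scores, insufficient, cutoff=0))
```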

Table 3 shows the proportion of women at risk for 25(OH)D insufficiency in the primary population without vitamin D supplementation. The proportion is presented for the different thresholds, for a desired probability and per season. For example, to have a probability of 80% of having 25(OH)D levels above 50 nmol/L, 100% are at risk in winter, 94% in spring and autumn and 79% in summer. This shows that the model could only identify a few women who do not need treatment at this threshold. However, at a threshold of 40 nmol/L, the model selects 88% at risk in winter and 50% in spring and autumn. This means that the model is able to select 50% of the population that would not need treatment for most of the year. It is clear that at a threshold of 30 nmol/L, even more women not at risk could be selected, but none at a threshold of 60 nmol/L.

Table 3 Application of the models: percentage of women at risk for 25(OH)D insufficiency per season for different thresholds of 25(OH)D and probability to reach the threshold

Discussion

In this study, we present four validated models for the prediction of vitamin D status for different 25(OH)D thresholds between 30 and 60 nmol/L. In our primary sample and in an external sample, reasonable predictive values were shown. The evaluation of the models with different thresholds shows that the model with a threshold of 60 nmol/L has fewer predictors and lower predictive performance than the other models. Furthermore, we have shown that the models with thresholds of 50 and 60 nmol/L have a low negative predictive value and are therefore less able to exclude low 25(OH)D levels in our population, due to the high prevalence of 25(OH)D levels lower than these thresholds. The highest AUCs were seen at thresholds of 30 and 40 nmol/L.

Predictors for insufficient 25(OH)D in all models were higher age, higher BMI, use of a walking aid, little time spent outdoors in summer, smoking, no calcium supplementation, no use of multivitamins, no use of vitamin D on prescription or self-administered and the period of the year when serum 25(OH)D was tested. Some predictors were not in all models. The intake of fatty fish is only present in the models with thresholds of 30 and 40 nmol/L. The use of margarine is a predictor in the models with thresholds from 30 to 50 nmol/L, but is a weak predictor at 50 nmol/L. This suggests that food-related vitamin D intake helps to prevent the most severe deficiency, but not to maintain higher values.

Most predictors are similar to those observed in earlier studies. The AUCs in other studies for a threshold of 50 nmol/L range from 0.73 to 0.86 [17,18,19]. Only one study (AUC of 0.71) was validated in an external population [17]. The age of the participants in these studies was similar, but there were differences in the prevalence of 25(OH)D deficiency. In contrast to our model, other studies added more complex predictors that consist of more than one question or require a physical examination.

We have observed that the use of calcium supplements is associated with higher 25(OH)D levels, which can be explained by a lower vitamin D metabolite turnover due to reduction of the parathyroid hormone levels [28].

With respect to vitamin D supplementation, it is remarkable that vitamin D on prescription is a stronger predictor than self-administered vitamin D. There are two possible explanations. First, the compliance might be better when vitamin D is given on prescription. Secondly, it is possible that patients obtain lower dosages without prescription.

The period of the year is one of the strongest predictors in our model. As our data covers 3 years, we observed that there were substantial differences per year. In most studies, the change in serum 25(OH)D levels is supposed to be a symmetric sinusoidal curve over the seasons. However, in our data, the mean levels over 3 years show a pattern in which the rise of serum 25(OH)D in spring is steeper than the decrease in autumn, a narrow peak in summer and a wide dip in winter. The result of this pattern is that the optimal model does not exactly follow the seasons. The peak is in the months of July and August; the dip is from December until the end of April. The other months are between these summer and winter levels. A similar pattern has been reported earlier in a Danish cohort of postmenopausal women [29].

A strength of this study is that it was performed in a large population of women at risk for fractures and that it was validated in an external population of women with comparable age. In the external validation, the loss of discrimination was limited or absent, indicating that the model is robust.

There are different opinions about the optimal threshold for serum 25(OH)D [18, 19]. Opinions may also differ on whether to prefer a high specificity with a high positive predictive value but more false negatives, or a high sensitivity with a high negative predictive value but more false positives. Therefore, we have presented our data in a way that lets medical professionals choose between different thresholds and between different positive and negative predictive values.

The selection of a population of women with increased fracture risk has advantages and disadvantages. An advantage is that we have exact insight into the performance of the model in a population in which 25(OH)D levels are clinically most important. A disadvantage is that the model has only been validated in Dutch Caucasian women with increased fracture risk, and therefore, we must be careful with a broader implementation of this model. Unfortunately, we did not have enough data on non-Caucasian women living in the Netherlands to gain more insight into these specific groups.

Another limitation is the lack of information on the dose of vitamin D supplementation. Furthermore, a large part of the variance is not explained. The unexplained variance might be reduced by improving the accuracy of the measurements, e.g., with the use of standardised intake questionnaires or the measurement of ultraviolet exposure. However, for our purpose, a simple tool for daily practice, more complex measurements are not suitable. One simple measurement that is lacking from our model and that might improve its performance is physical activity [23, 24]. Information on physical activity was not available in our study, but there may be an overlap with impaired mobility, the use of a walking aid and the time that is spent outdoors [23]. Time spent outdoors in summer as well as in winter are positive predictors. Because of the low intensity of the sun in winter, we do not think that this is a direct effect, but that it might be a reflection of physical activity and general health as well [31].

The 25(OH)D assay we used was not compared with the reference of the National Institute of Standards and Technology. Nevertheless, we used a single 25(OH)D assay with good performance [30].

In conclusion, the proposed model can help to distinguish women with adequate serum 25(OH)D levels in a population of older women with risk factors for fractures. This might reduce unnecessary treatment with vitamin D in some relatively healthy older women. To reach levels above 50 nmol/L during the whole year, practically all women need supplementation and many of them need continuous supplementation. If one allows levels of 30 or 40 nmol/L, the model can substantially reduce the number of women who need supplementation, at least during a part of the year. This shows that the advice in guidelines can have a substantial impact on the number of women with an indication for supplementation.