Introduction

Coronavirus disease 2019 (COVID-19) has continued to spread around the globe since being declared a pandemic by the WHO in March 2020, causing significant morbidity and mortality. Because the vitamin D receptor is present in various types of cells and tissues, vitamin D is known for its biologic activities on many organ systems and appears to play an essential role, as an immunomodulatory agent, in the prevention of respiratory infections [1, 2].

Given its potential implications in sepsis, vitamin D may also play a role in COVID-19 pathophysiology, which shares several clinical features with severe sepsis [2].

The SARS-CoV-2 infection rate has been reported to be higher in countries with low vitamin D levels and in patients with low vitamin D levels [4, 6]. Further research has demonstrated low vitamin D levels in patients with severe COVID-19 disease [3, 5]. Nevertheless, observational studies have so far yielded conflicting results, and more evidence from large population-based studies is needed to establish the risk of COVID-19 in populations with vitamin D deficiency [25].

Clalit Health Services (CHS), the largest healthcare organization in Israel, provides comprehensive health services to over 4.7 million members and has centrally managed electronic health records (EHR) for over two decades, including laboratory tests, diagnoses, and hospitalization records [7]. This provides a unique opportunity to study the association between vitamin D levels and SARS-CoV-2 incidence and disease severity.

Methods

Study population and data collection

From the CHS data warehouse, we collected selected variables from the EHR of individuals who underwent vitamin D testing between January 1, 2010, and February 29, 2020. For each individual, we extracted the last vitamin D level as well as the last BMI measured during the period. In addition, we collected the following demographic variables: date of birth (for age calculation), gender, and 3-level socioeconomic status. We also extracted coded comorbidity diagnoses (cardiac arrhythmia, asthma, congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), chronic renal failure, diabetes mellitus, hypertension, ischemic heart disease (IHD), and malignancy). All baseline variables were collected in February 2020, before the onset of the pandemic. Each individual's primary care clinic was used to assign a geographic region and one of the three main ethnic groups living in Israel: general, Ultra-Orthodox, and Arab.

We used the national COVID-19 database established by the Israeli Ministry of Health to collect data on patients who had a positive RT-PCR test for SARS-CoV-2 and/or had been hospitalized with the disease between March 1 and October 31, 2020. We used these data to build two matched case–control groups. The first group (A) was designed to assess factors that affect the risk of SARS-CoV-2 infection: case patients were individuals with at least one positive SARS-CoV-2 test, with the date of the first positive test taken as the index date. As controls, for each SARS-CoV-2-positive patient we matched ten individuals without a positive SARS-CoV-2 test who were of the same age, gender, geographic region, and socioeconomic status, and assigned them the same index date. A second group (B) was designed to assess factors that affect the risk of severe COVID-19 in patients with a positive test for SARS-CoV-2: case patients were individuals hospitalized in severe condition for COVID-19 (World Health Organization (WHO) severity scale [7] of 6 or above) or who died from the disease, with the date of the first positive test taken as the index date. As controls, for each patient hospitalized in severe condition we matched one individual with a positive SARS-CoV-2 test who was of the same age, gender, and socioeconomic status but was not hospitalized.
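The matching step described above can be illustrated with a minimal sketch. It is not the procedure actually run on the CHS data warehouse: the function name, the record fields (id, age, gender, region, ses, index_date), and the choice to draw controls without replacement are all assumptions made for illustration, and the sample records are invented.

```python
import random
from collections import defaultdict


def match_controls(cases, candidates, keys, ratio, seed=0):
    """Exact-match `ratio` controls to each case on the fields in `keys`.

    Assumption: controls are drawn without replacement; the text does not
    say whether a control could be reused. Cases whose stratum has too few
    remaining candidates are left unmatched.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in candidates:
        pool[tuple(person[k] for k in keys)].append(person)
    for stratum in pool.values():
        rng.shuffle(stratum)

    matched = {}
    for case in cases:
        stratum = pool[tuple(case[k] for k in keys)]
        if len(stratum) < ratio:
            continue  # no complete set of controls available
        picked = [stratum.pop() for _ in range(ratio)]
        # Controls inherit the index date of their case.
        matched[case["id"]] = [dict(c, index_date=case["index_date"]) for c in picked]
    return matched


# Hypothetical records, for illustration only.
KEYS = ("age", "gender", "region", "ses")
cases = [
    {"id": "c1", "age": 46, "gender": "F", "region": "north", "ses": 2, "index_date": "2020-04-01"},
    {"id": "c2", "age": 46, "gender": "F", "region": "south", "ses": 2, "index_date": "2020-05-01"},
]
candidates = [
    {"id": f"p{i}", "age": 46, "gender": "F", "region": "north", "ses": 2}
    for i in range(12)
]
matched = match_controls(cases, candidates, KEYS, ratio=10)
```

In this toy input, case c1 receives ten controls from its stratum, while c2 has no candidate in the same region and is dropped. This mirrors how only cases for which a match was found enter a matched group. The same function with ratio=1 and without the region key would correspond to the 1:1 design of group (B).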

This study was approved by the CHS Institutional Review Board (IRB) with a waiver of informed consent.

Laboratory and clinical measurements

Vitamin D levels were tested by the LIAISON 25-OH vitamin D TOTAL assay (DiaSorin USA, Stillwater, Minn), a competitive 2-step chemiluminescence assay (67). The measuring range of this assay is 10 to 350 nmol/L; analytical sensitivity is < 2.5 nmol/L, and functional sensitivity is < 10.0 nmol/L. The intra-assay precision is up to 5%, and the inter-assay precision is up to 15%. The specificity is 104% for 25-OH vitamin D2 and 100% for 25-OH vitamin D3. For patients with multiple vitamin D tests we used the last measured level during the study period. According to the measured levels, patients were assigned to four predefined vitamin D ranges (more than 75 nmol/L—normal levels; 50–75 nmol/L—insufficiency; 30–50 nmol/L—deficiency, and less than 30 nmol/L—severe deficiency).
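As an illustration of the two rules in this paragraph (keep the last measured level, then assign it to one of the four predefined ranges), a short sketch follows. The function names are hypothetical, and the treatment of values falling exactly on 50 nmol/L is an assumption, since the ranges as stated overlap at that boundary.

```python
def last_level(tests):
    """Return the level from the most recent (ISO date, level) pair."""
    return max(tests, key=lambda t: t[0])[1]


def vitamin_d_category(level_nmol_l):
    """Map a 25-OH vitamin D level (nmol/L) to the four study ranges.

    'More than 75' is normal and 'less than 30' is severe deficiency, as in
    the text. Assigning exactly 50 to insufficiency is an assumption.
    """
    if level_nmol_l < 30:
        return "severe deficiency"
    if level_nmol_l < 50:
        return "deficiency"
    if level_nmol_l <= 75:
        return "insufficiency"
    return "normal"


# Hypothetical test history for one patient.
tests = [("2015-03-02", 28.0), ("2019-11-20", 52.5)]
category = vitamin_d_category(last_level(tests))
```

Here the earlier value of 28.0 nmol/L is ignored in favor of the later 52.5 nmol/L, so the patient is classified by the most recent test only.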

For each patient, we used the last BMI measurement documented in the EHR during patient encounters that took place before February 29, 2020. Patients were assigned to four predefined BMI categories (less than 25, 25–30, 30–35, and more than 35 kg/m2) according to their last BMI measurement.

Statistical analysis

In descriptive tables, the statistical significance of differences observed between groups was assessed by the Fisher exact test for categorical variables, and the two-tailed Wilcoxon–Mann–Whitney U test for continuous variables. Conditional logistic regression models were fitted to estimate the odds ratio [OR] and corresponding 95% confidence interval [CI] of SARS-CoV-2 infection in matched group A and of severe COVID-19 disease in matched group B. In both groups, the association between vitamin D ranges and outcome was assessed first using univariable models, and then using several multivariable models, adjusted for ethnic group, BMI categories, and main comorbidities.
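To give an intuition for what conditional logistic regression estimates in a matched design, the sketch below covers only the simplest special case: 1:1 matched pairs (as in group B) and a single binary exposure. In that case the conditional maximum likelihood estimate of the OR reduces to the ratio of discordant pairs. This is not the R analysis used in the study, which involved multi-level exposures and adjustment covariates. The function name is hypothetical and the pair counts are invented.

```python
import math


def matched_pair_odds_ratio(pairs, z=1.96):
    """Conditional OR for a binary exposure in 1:1 matched pairs.

    `pairs` is an iterable of (case_exposed, control_exposed) booleans.
    Only discordant pairs carry information; the estimate is their ratio,
    with a Wald interval computed on the log scale.
    """
    pairs = list(pairs)
    case_only = sum(1 for case, ctrl in pairs if case and not ctrl)
    control_only = sum(1 for case, ctrl in pairs if ctrl and not case)
    if case_only == 0 or control_only == 0:
        raise ValueError("need discordant pairs in both directions")
    odds_ratio = case_only / control_only
    se_log = math.sqrt(1 / case_only + 1 / control_only)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Invented counts: 60 pairs where only the case is exposed, 40 where only
# the control is exposed; concordant pairs do not affect the estimate.
pairs = (
    [(True, False)] * 60
    + [(False, True)] * 40
    + [(True, True)] * 30
    + [(False, False)] * 70
)
odds_ratio, ci_low, ci_high = matched_pair_odds_ratio(pairs)
```

Because each comparison is made within a matched pair, the matching variables (age, gender, socioeconomic status) cancel out of the estimate, which is why they do not appear as covariates in the models.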

p values below 0.05 were considered significant. Statistical analyses were performed using R statistical software version 3.6 (R Foundation for statistical computing).

Results

Between January 1, 2010, and February 29, 2020, 1,350,000 distinct patients (about 30% of CHS members) had their serum vitamin D levels measured, and these records were kept in CHS databases. From March 1 to October 31, 2020, 130,582 distinct CHS members had positive RT-PCR tests for SARS-CoV-2. Table 1 compares the baseline characteristics of patients with positive tests (cases) with those of the 4,502,455 CHS members without a positive test, who served as controls. The age and gender distribution of patients who tested positive during this period was similar to that of the rest of the population, with a median age of 31–32 and about 51 percent female individuals. Vitamin D levels observed in patients who later tested positive were markedly lower: the median and interquartile range [IQR] of the last measured vitamin D levels were 47.8 nmol/L [31.4–65.2] for individuals who later tested positive (cases) vs. 55.0 nmol/L [37.9–72.0] for controls (p < 0.001); severe vitamin D deficiency, defined by vitamin D levels below 30 nmol/L, was present in 23.1% of case patients vs. 16.0% of control patients (p < 0.001). Marked differences were also observed in the socioeconomic levels and ethnic groups affected by the disease. BMI, which was available for 95% of the patients, was slightly higher in cases, with a median BMI of 24.5 kg/m2 for patients who later tested positive vs. 23.5 kg/m2 for controls.

Table 1 Baseline characteristics of SARS-CoV-2-positive patients vs. other CHS members

Figure 1 displays the distribution of baseline vitamin D levels among patients who later tested positive for SARS-CoV-2 (red) vs. the rest of the population (gray) in males (top) and females (bottom). Lower vitamin D levels were observed in patients who were later infected by SARS-CoV-2 (p < 0.001), and in particular, there was a large proportion of females with vitamin D values below 40 nmol/L among the cases. Previous studies in our health organization have shown that vitamin D deficiency is particularly prevalent in the Ultra-Orthodox Jewish and Arab populations (particularly among females). Remarkably, these two groups were disproportionately affected by SARS-CoV-2 [8].

Fig. 1 Distribution of vitamin D measured in the blood between years 2010–2020 among individuals later infected with SARS-CoV-2 and the rest of the population. Histograms showing the distribution of vitamin D levels measured in males (top) and females (bottom). The red histogram is for individuals who later tested positive for SARS-CoV-2; the gray histogram is for the rest of the population

Having identified a high prevalence of vitamin D deficiency among SARS-CoV-2 patients, we sought to account for potential confounders by assessing the association between baseline vitamin D and the risk of SARS-CoV-2 infection in a group matched for age, gender, geographic region, and socioeconomic status, with and without statistical adjustment for BMI, ethnic group, and comorbidity. For this purpose, we built a matched group (A) of patients who had their pre-pandemic vitamin D levels measured, where each patient who tested positive for SARS-CoV-2 was matched to ten control patients of the same age, gender, geographic region, and socioeconomic status, as of February 2020, before the onset of the pandemic. A match was found for 41,757 individuals who tested positive, with 417,570 assigned controls. The characteristics of the matched group are shown in Table 2. By design, the age, gender, regional, and socioeconomic status distributions of case and control patients were identical. The requirement of having prior vitamin D tests and the matching procedure increased the median age of this group (46 years) and the proportion of female individuals (63.5%). Even after matching for age, gender, region, and socioeconomic status, marked differences persisted in vitamin D levels, ethnic group, and BMI distributions between cases and controls.

Table 2 Demographics and clinical characteristics of matched group (A) of SARS-CoV-2-positive patients with matched controls

Table 3 displays the conditional logistic regression results for SARS-CoV-2 infection status in the matched group. Model (1) is a univariable model based only on baseline vitamin D level ranges: compared to vitamin D levels above 75 nmol/L, which served as the reference value, severe vitamin D deficiency (< 30 nmol/L) carried a higher risk of SARS-CoV-2 infection [OR = 1.442, 95% CI 1.392–1.494], but even vitamin D insufficiency (between 50 and 75 nmol/L) was associated with significantly increased risk. Model (2) is a multivariable model incorporating BMI in addition to baseline vitamin D levels. Compared with individuals with a BMI below 25 kg/m2, overweight patients had markedly increased risk for SARS-CoV-2 infection, starting at BMI range 25–30 [OR = 1.198, 95% CI 1.167–1.229], and the risk gradually increased for BMIs between 30 and 35 [OR = 1.407, 95% CI 1.365–1.451], and over 35 [OR = 1.471, 95% CI 1.418–1.526]. An inverse correlation is frequently observed between serum vitamin D levels and BMI [9, 10], and lower vitamin D levels can be attributed to sequestration of vitamin D in fat tissues. In this model, the association between vitamin D and SARS-CoV-2 infection risk is maintained even after adjustment for BMI. Model (3) is a multivariable model adjusted for ethnic group, in addition to BMI. Belonging to an ethnic group with a high prevalence of SARS-CoV-2 infection by itself incurs a significantly increased risk, but even after controlling for these factors, low vitamin D levels were associated with a significantly increased risk. Model (4) is a multivariable model adjusted for comorbidity, in addition to BMI and ethnic group. The risk of infection appears slightly increased in patients with chronic renal failure [OR = 1.154, 95% CI 1.091–1.222], CHF [OR = 1.125, 95% CI 1.048–1.208], diabetes mellitus [OR = 1.120, 95% CI 1.086–1.156], and arrhythmia [OR = 1.088, 95% CI 1.03–1.143], while patients with malignancy appeared to have a slightly decreased risk of infection [OR = 0.932], possibly reflecting increased adherence to social distancing among these patients. The association between vitamin D levels and infection risk persisted even after adjusting for these comorbidity factors.
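For readers who wish to compare the strength of the reported associations, a reported OR and its 95% CI can be converted back to a log-odds coefficient, an approximate standard error, and a z statistic. The sketch below does this for the Model (1) estimate for severe vitamin D deficiency. It assumes the published intervals are symmetric Wald intervals on the log scale, which the text does not state, and the function name is hypothetical.

```python
import math


def log_or_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Back out the log-odds coefficient, its standard error, and the Wald z.

    Assumption: the reported interval is a symmetric Wald interval on the
    log scale, so its width equals 2 * z * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se, beta / se


# Model (1), severe deficiency (< 30 nmol/L) vs. > 75 nmol/L, as reported.
beta, se, wald_z = log_or_summary(1.442, 1.392, 1.494)
```

Under that assumption, the coefficient is about 0.366 with a standard error of about 0.018, giving a z statistic of roughly 20. The narrow interval reflects the very large matched sample rather than a large effect size.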

Table 3 Conditional logistic regression models for SARS-CoV-2 infection status in matched group (A) of SARS-CoV-2-positive patients with matched controls

After evaluating the association between vitamin D deficiency and SARS-CoV-2 infection status, we proceeded to investigate whether, among patients with evidence for SARS-CoV-2 infection, vitamin D levels were also associated with disease severity. For this purpose, we used the second matched group (B), in which patients who were hospitalized with severe COVID-19 or who died from the disease were matched 1:1 with control patients who had a positive SARS-CoV-2 PCR test but were not hospitalized for their disease. All patients in the matched group had baseline vitamin D levels, and controls were selected to match case patients’ age, gender, and socioeconomic status. A total of 2533 patients with severe disease were assigned 2533 matched controls. The characteristics of the matched group are presented in Table 4. By design, the age, gender, and socioeconomic status distributions of case and control patients are identical. The matching procedure increased the median age of the group to 74 years, and the proportion of female individuals in the matched group is 51.8%.

Table 4 Demographics and clinical characteristics of matched group (B) of COVID-19 patients hospitalized in severe condition with matched controls of SARS-CoV-2-positive individuals not hospitalized

Table 5 displays conditional logistic regression results for severe COVID-19 disease or death in this second matched group. Model (1) shows that severe vitamin D deficiency (< 30 nmol/L) carries a significantly increased risk for hospitalization with severe disease [OR = 1.777, 95% CI 1.477–2.138], and this risk was also increased for 30–50 nmol/L [OR = 1.256, 95% CI 1.057–1.492]. The association between low vitamin D levels and severe disease is maintained in multivariable models adjusted for BMI (2), ethnic group (3), and comorbidity factors (4). We observed a significantly increased risk of severe disease for infected patients with chronic renal failure [OR = 2.077, 95% CI 1.721–2.506], CHF [OR = 1.757, 95% CI 1.404–2.198], COPD [OR = 1.511, 95% CI 1.172–1.947], diabetes mellitus [OR = 1.491, 95% CI 1.300–1.709], malignancy [OR = 1.343, 95% CI 1.134–1.590], and hypertension [OR = 1.280, 95% CI 1.097–1.494]. The association between vitamin D levels and the risk of severe disease is maintained even after adjustment for comorbidity factors: in the fully adjusted model (4), there was a significantly increased risk for severe disease for patients with vitamin D below 30 nmol/L [OR = 1.513, 95% CI 1.230–1.861], and between 30 and 50 nmol/L [OR = 1.311, 95% CI 1.083–1.587].

Table 5 Conditional logistic regression models for severe disease or death due to COVID-19, in matched group (B)

Discussion

In this large retrospective case–control study, an inverse correlation was demonstrated between the baseline level of vitamin D and the risks of SARS-CoV-2 infection and of severe COVID-19 disease when infected. Significant associations were also found between obesity and comorbidity factors and the studied outcomes. However, even after adjusting for these factors, low vitamin D levels remained significantly associated with the outcomes.

Our report confirms the results of several other observational studies showing an association between low vitamin D levels and COVID-19, particularly in patients suffering from severe disease [4, 12, 13, 21, 22, 23, 26]. In addition, northern latitude (associated with lower vitamin D levels) was found to be associated with a higher hospitalization rate for COVID-19 as well as a higher mortality rate compared with southern latitudes [2]. Nevertheless, a cohort study of working-age adults found no evidence for an independent association between low levels of vitamin D and SARS-CoV-2 seropositivity [20], and a recent review concluded that there is currently insufficient information to guide the use of vitamin D as a treatment for COVID-19, as the evidence for the effectiveness of vitamin D supplementation for the treatment of COVID-19 is uncertain, and there is only limited safety information [24].

Several potential mechanisms may explain the observed association between vitamin D levels and SARS-CoV-2 incidence and disease severity [14]. Notably, respiratory viruses disrupt cell junction integrity [15], while vitamin D maintains cell junctions and exhibits protective effects against endothelial dysfunction and thrombosis. Furthermore, vitamin D enhances cellular innate immunity partly through the induction of antimicrobial peptides which may interfere with viral replication [16].

Our study has several unique strengths: a very large population of individuals with vitamin D levels measured in community settings before the pandemic; accurate BMI data, as proper documentation of BMI during patient encounters was evaluated as a quality measure in our health organization; extensive demographic and clinical data documented in the electronic health records, allowing for adequate data control; and the ability to reliably identify all patients with positive SARS-CoV-2 testing and the COVID-19 hospitalization outcomes, thanks to a central database established by the Israeli Ministry of Health and the universal use of the national ID number as patient identifier. Despite the comprehensive demographic and clinical background data, we acknowledge the limitations inherent to an observational study, notably the difficulty of eliminating all possible confounders. Whether vitamin D plays a causal role in COVID-19 pathophysiology or is just a marker of ill health is not known, and our results should be interpreted carefully, as patients positive for SARS-CoV-2 and those with severe COVID-19 had a higher number of comorbidities.

A major limitation of our study is the long time range during which vitamin D levels were measured before eventual infection or hospitalization, and the lack of information on treatment with vitamin D supplements during this period. We hypothesized that patients with low levels of vitamin D who were treated with supplements performed a repeat test to monitor their levels, as a blood test is recommended a few months after beginning treatment [25]. We therefore extracted for each individual the latest vitamin D level available.

Interventional, randomized controlled trials are classically required to establish causality of observed statistical associations. Small clinical trials have shown promising results [17, 18], and larger trials are underway [19]. Given a rapidly spreading pandemic with a high casualty rate, the increasing body of evidence showing significantly increased risk in vitamin D-deficient patients, and the known relative safety of daily vitamin D intake at recommended doses, one may argue in favor of maintaining normal levels of vitamin D as a prevention measure, in particular for populations at risk [27]. Further large randomized controlled trials are warranted to determine whether vitamin D supplementation can decrease COVID-19 incidence and severity.

Conclusion

In this large observational population study, we show a significant association between vitamin D deficiency and the risks of SARS-CoV-2 infection and of severe disease in those infected.