Study design and population included
A retrospective cohort was built using the databases of the public healthcare system in Catalonia. We analyzed all individuals ≥ 18 years old living in the Barcelona and Central Catalonia regions on 25 February 2020, the date of the first positive PCR for SARS-CoV2 in our country (n = 4,643,139).
In this population, we performed three independent studies to investigate the association of cholecalciferol or calcifediol supplementation with COVID-19 outcomes:
- Comparison of COVID-19 outcomes between supplemented patients and propensity score-matched controls: We identified all patients receiving cholecalciferol (n = 201,445) or calcifediol (n = 207,136) supplementation from 1 April 2019 to 28 February 2020 and patients not receiving any vitamin D supplement (n = 4,267,430) during the same period. Since chronic kidney disease (CKD) is a strong predictor of worse prognosis in COVID-19 [27], subjects without an available serum creatinine determination performed between 1 October 2018 and 28 February 2020 were excluded from the study. After propensity score matching (see below), 108,343 patients on cholecalciferol, 216,686 matched controls (cholecalciferol controls), 134,703 patients on calcifediol and 269,406 matched controls (calcifediol controls) were selected for the analysis.
- Association between mean daily cholecalciferol or calcifediol dose and COVID-19 outcomes: All patients receiving cholecalciferol or calcifediol supplementation from 1 November 2019 to 28 February 2020, with an available serum creatinine level (n = 165,588 and 132,590, respectively), were selected for this analysis. This shorter period was chosen to minimize the effects of possible changes in the dose of these drugs.
- Comparison of COVID-19 outcomes between cholecalciferol- or calcifediol-supplemented patients with a sufficient vitamin D status (serum 25OHD > 30 ng/mL) and unsupplemented vitamin D-deficient (serum 25OHD < 20 ng/mL) patients: In order to reduce the variability in serum 25OHD levels due to seasonal sun exposure, we only analyzed serum levels determined between 1 November 2019 and 28 February 2020. All patients in the cohort who had a serum 25OHD determination in this period (n = 85,158) were included in this analysis.
Data sources
Given Catalonia’s universal health and medication coverage, we were able to use electronic databases to examine the association of cholecalciferol and calcifediol use with COVID-19 outcomes in a real-world setting. We used anonymized data provided by the Catalan Agency for Health Quality and Evaluation (AQUAS) within the framework of the Data Analytics Program for Health Research and Innovation (PADRIS). PADRIS databases include information on demographics (age and sex), drugs supplied by pharmacies, Primary Care physician diagnoses, laboratory results, and the diagnoses, procedures and outcomes of medical admissions in the public hospitals of Catalonia. This project was approved in a public call for grants for using PADRIS databases in research projects on COVID-19.
Identification of patients on cholecalciferol or calcifediol supplementation
Patients who had been supplied in pharmacies with drugs of the Anatomical Therapeutic Chemical Classification System groups A11CC05, A12AX, M05BB03, M05BB07, M05BB08, M05BB09, A11CC06 or A11CC55 from 1 April 2019 to 28 February 2020 were analyzed. The sum of Defined Daily Doses (DDD) of cholecalciferol or the sum of calcifediol doses supplied from 1 November 2019 to 28 February 2020 was obtained and converted into micrograms, and the mean daily cholecalciferol or calcifediol dose received per patient, in micrograms, was calculated. Patients receiving formulations containing > 250 μg of cholecalciferol (12.5 DDD) or > 250 μg of calcifediol per dose were considered as receiving bolus doses.
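The dose calculation can be illustrated with a minimal sketch. It is not the authors' code: it assumes 20 μg per DDD of cholecalciferol (the value implied by 250 μg = 12.5 DDD) and that the mean is taken over the inclusive number of days between 1 November 2019 and 28 February 2020. All function and constant names are hypothetical.

```python
from datetime import date

# Assumption: 20 ug per DDD, implied by "250 ug of cholecalciferol (12.5 DDD)".
UG_PER_DDD_CHOLECALCIFEROL = 20.0
# Per-dose threshold for a bolus formulation, as stated in the text.
BOLUS_THRESHOLD_UG = 250.0

PERIOD_START = date(2019, 11, 1)
PERIOD_END = date(2020, 2, 28)
# Assumption: the denominator is the inclusive day count of the period.
PERIOD_DAYS = (PERIOD_END - PERIOD_START).days + 1


def mean_daily_cholecalciferol_ug(total_ddd):
    """Convert the summed DDD to micrograms and average over the period."""
    return total_ddd * UG_PER_DDD_CHOLECALCIFEROL / PERIOD_DAYS


def mean_daily_calcifediol_ug(total_ug):
    """Average the summed calcifediol dose (already in micrograms)."""
    return total_ug / PERIOD_DAYS


def is_bolus(ug_per_dose):
    """A formulation counts as a bolus when it exceeds 250 ug per dose."""
    return ug_per_dose > BOLUS_THRESHOLD_UG
```

Under these assumptions, a patient supplied with one DDD per day over the whole period would have a mean daily cholecalciferol dose equal to the DDD itself.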
Identification of control subjects through propensity score matching
We performed two independent propensity score matching procedures to build the control groups for cholecalciferol and calcifediol using the 'Matching' package in R [28], as previously described [26]. First, we used multivariate logistic regression to model receipt of each drug as a function of the following covariates: sex, age, fifteen comorbidities identified from the International Classification of Diseases (ICD-10) diagnostic codes issued by family physicians (Supplementary Table 1), estimated glomerular filtration rate (eGFR), history of cigarette smoking, nursing home residence and use of seven classes of drugs that could potentially affect the prognosis (Supplementary Table 1). Estimated glomerular filtration rate was obtained from serum levels of creatinine, sex and age according to the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation [29]. Propensity scores were matched using the nearest-neighbor matching method without replacement at a 1:2 ratio of treated subjects to controls. A caliper of 0.2 of the standard deviation of the propensity score logit was established as the maximum tolerated difference between matched patients. To examine the balance of each covariate between the treatment and control groups, the standardized mean difference was calculated before and after matching using the 'tableone' package in R [30]. We considered the groups well balanced if the standardized mean difference was < 0.10 for each covariate.
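The matching and balance steps can be sketched in a few lines. The authors used the 'Matching' and 'tableone' packages in R; the Python code below is only an illustrative greedy implementation, not a port of those packages. It assumes that propensity scores have already been estimated, that the caliper is computed from the logits of all subjects pooled, and that treated subjects with fewer than two controls inside the caliper are dropped. Function names are hypothetical.

```python
import math
import statistics


def logit(p):
    """Log-odds of a propensity score strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def match_nearest_neighbor(treated_ps, control_ps, ratio=2, caliper_sd=0.2):
    """Greedy nearest-neighbour matching on the logit, without replacement.

    Returns {treated_index: [control_index, ...]}.
    """
    t_logit = [logit(p) for p in treated_ps]
    c_logit = [logit(p) for p in control_ps]
    # Assumption: the caliper uses the SD of the pooled logits.
    caliper = caliper_sd * statistics.stdev(t_logit + c_logit)
    available = set(range(len(c_logit)))
    matches = {}
    for i, lt in enumerate(t_logit):
        # Unused controls inside the caliper, closest first (index breaks ties).
        candidates = sorted(
            (j for j in available if abs(c_logit[j] - lt) <= caliper),
            key=lambda j: (abs(c_logit[j] - lt), j),
        )
        # Assumption: a treated subject without `ratio` controls is dropped.
        if len(candidates) >= ratio:
            chosen = candidates[:ratio]
            matches[i] = chosen
            available.difference_update(chosen)
    return matches


def standardized_mean_difference(treated, control):
    """Absolute mean difference divided by the pooled standard deviation.

    Requires at least two values per group and non-zero pooled variance.
    """
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2.0
    )
    return abs(statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Toy propensity scores (invented for illustration only).
toy_matches = match_nearest_neighbor(
    [0.30, 0.60], [0.29, 0.31, 0.58, 0.62, 0.05, 0.95]
)
```

A covariate would then be considered balanced when `standardized_mean_difference` returns a value below 0.10 for the matched groups.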
Serum levels of 25-hydroxyvitamin D
Serum levels of 25OHD determined in the laboratories of the Catalan public health system between 1 November 2019 and 28 February 2020 in the whole cohort were obtained from PADRIS databases. A deficient vitamin D status was defined as a serum 25OHD level < 20 ng/mL and a sufficient vitamin D status was defined as a serum 25OHD level ≥ 30 ng/mL.
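As a small illustration, the two thresholds defined in this subsection translate into the following classification. The function name and the "intermediate" label for values between the two thresholds are hypothetical; the text does not name that range.

```python
def vitamin_d_status(serum_25ohd_ng_ml):
    """Classify a serum 25OHD level (ng/mL) using the thresholds above."""
    if serum_25ohd_ng_ml < 20:
        return "deficient"
    if serum_25ohd_ng_ml >= 30:
        return "sufficient"
    # Hypothetical label: values from 20 up to (but not including) 30 ng/mL.
    return "intermediate"
```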
Outcome variables
We analyzed the occurrence of SARS-CoV2 infection, COVID-19 hospitalization, intensive care admission, the procedures during hospitalization and mortality during the first wave of the pandemic. Four main outcome variables were defined, with different timings due to the natural course of the disease:
SARS-CoV2 infection
Positive PCR result for SARS-CoV2 or a clinical diagnosis made by a Primary Care physician, or a hospital discharge report stating a diagnosis of COVID-19 (ICD-10 codes used are displayed in Supplementary Table 1), from 25 February 2020 to 30 April 2020. Time (in days) from 24 February 2020 until a positive PCR or a clinical diagnosis (the first event) was used for survival analysis. Censored time for those individuals without the event was the time from 24 February to 30 April 2020.
COVID-19 mortality
Patients diagnosed with COVID-19 infection resulting in death between 25 February and 15 May 2020. Patients with COVID-19 admitted to hospital before 16 May 2020 and dying before 7 June were also included. Time (in days) from 24 February 2020 to COVID-19 death was used for survival analysis. Censored time for those individuals without the event was the time from 24 February to 7 June 2020.
Severe COVID-19
Composite outcome of COVID-19 mortality, as already defined, or COVID-19 hospital admission needing non-invasive mechanical ventilation, orotracheal intubation, mechanical ventilation or intensive care unit admission from 25 February 2020 to 15 May 2020. Time (in days) from 24 February 2020 until hospital admission (if severe COVID-19 developed during hospitalization) or time (in days) from 24 February 2020 until COVID-19 death was used for survival analysis. Censored time for those individuals without the event was the time from 24 February to 7 June 2020.
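The follow-up times used for survival analysis can be derived as in the sketch below, which takes 24 February 2020 as the time origin and the censoring dates stated for each outcome. The outcome keys and function name are hypothetical, and treating an event dated after the censoring date as censored is an assumption.

```python
from datetime import date

ORIGIN = date(2020, 2, 24)

# Censoring dates as stated in the outcome definitions above.
CENSOR_DATE = {
    "infection": date(2020, 4, 30),
    "mortality": date(2020, 6, 7),
    "severe": date(2020, 6, 7),
}


def time_and_status(outcome, event_date=None):
    """Return (days since the origin, event indicator) for one subject."""
    censor = CENSOR_DATE[outcome]
    # Assumption: events dated after the censoring date count as censored.
    if event_date is not None and event_date <= censor:
        return (event_date - ORIGIN).days, 1
    return (censor - ORIGIN).days, 0
```

For example, `time_and_status("infection")` gives the censored follow-up of a subject who was never diagnosed, and `time_and_status("mortality", date(2020, 3, 25))` gives the time to a COVID-19 death on 25 March 2020.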
Statistical analysis
Continuous variables are reported as mean and standard deviation and qualitative variables are summarized by frequencies and percentages. Baseline differences between treated and untreated groups were assessed using Student’s t test or chi-square test and standardized mean differences.
Once the control groups were established, associations between cholecalciferol or calcifediol supplementation and outcome variables were further analyzed using unadjusted and multivariate Cox proportional hazards regression models. All the variables that approached statistical significance (p < 0.2) were initially selected for inclusion in the adjusted analyses. Multivariable models were constructed by means of a stepwise forward inclusion procedure and only the significant variables were retained in the final model. Unadjusted and adjusted hazard ratios and their 95% confidence intervals are reported.
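To make the unadjusted hazard ratio concrete, the sketch below fits a Cox model with a single binary covariate by Newton-Raphson on the partial likelihood (Breslow handling of ties) and returns the hazard ratio with a Wald 95% confidence interval. It is a teaching example only: the analyses in this study were run in SPSS and R, the stepwise multivariable selection is not reproduced, and the toy data and function names are invented for illustration.

```python
import math


def _score_and_information(beta, times, events, x):
    """Score and observed information of the Cox partial likelihood."""
    score, info = 0.0, 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0
        score += x_i - mean
        info += s2 / s0 - mean * mean
    return score, info


def cox_hazard_ratio(times, events, x, max_iter=50, tol=1e-9):
    """Return (hazard ratio, lower 95% limit, upper 95% limit)."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = _score_and_information(beta, times, events, x)
        step = score / info  # Newton-Raphson update
        beta += step
        if abs(step) < tol:
            break
    _, info = _score_and_information(beta, times, events, x)
    se = 1.0 / math.sqrt(info)
    return (
        math.exp(beta),
        math.exp(beta - 1.96 * se),
        math.exp(beta + 1.96 * se),
    )


# Toy data (invented): x = 1 for supplemented, 0 for control subjects.
toy_times = [5, 8, 12, 20, 2, 3, 6, 9]
toy_events = [1, 0, 1, 0, 1, 1, 1, 0]
toy_group = [1, 1, 1, 1, 0, 0, 0, 0]
hr, lower, upper = cox_hazard_ratio(toy_times, toy_events, toy_group)
```

In this toy data set the supplemented group has later events, so the estimated hazard ratio falls below 1. With so few subjects the confidence interval is wide, which is why the example should not be read as evidence of any effect.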
Because vitamin D supplementation may be prescribed to treat a low vitamin D status, we also compared the outcome variables between supplemented patients with a sufficient vitamin D status and unsupplemented vitamin D-deficient patients, also using multivariate Cox regression analysis. Finally, the associations between the mean daily cholecalciferol or mean daily calcifediol dose and COVID-19 outcomes were analyzed using multivariate Cox regression analysis.
For all statistical tests, a p value < 0.05 was used for statistical significance.
Descriptive statistics and survival analysis were carried out using SPSS version 25.0 for Windows (SPSS, Chicago, IL, USA) and the 'survival' and 'survminer' packages in R [31, 32].
Ethical issues and confidentiality
All data were treated anonymously in order for this study to comply with the provisions of Spanish and European laws on Protection of Personal Data. The study was approved by the ethics committee of the Corporació Sanitària Parc Taulí-Universitat Autònoma de Barcelona.