1 Introduction

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), is a highly contagious, fast-spreading, and life-threatening infectious disease [1]. In humans, COVID-19 causes respiratory infection that ranges from asymptomatic to severe conditions, such as acute respiratory distress syndrome and multiple organ dysfunction syndrome. Although Paxlovid has been authorized for emergency use by the FDA, effective treatments for COVID-19 remain limited in the clinic [2,3,4,5]. Thus, identifying potential risk factors for COVID-19 would provide substantial benefit for public health.

In the context of the COVID-19 pandemic, multiple predisposing factors have been identified, including obesity [6], smoking [7], diabetes [8], and autoimmune diseases [9]. For example, smoking, a well-known susceptibility factor for respiratory infections, has been confirmed to be closely linked to COVID-19 infection [5, 10, 11]. Recently, Daniel et al. revealed that former smokers with a positive SARS-CoV-2 test had a higher percentage of hospitalizations and deaths within 14 days. An increased risk of severe COVID-19, defined as the requirement for ICU admission owing to the development of dyspnea and acute respiratory distress syndrome, has also been commonly observed in former smokers [12]. In addition, since obesity has been shown to increase vulnerability to influenza A and H1N1 influenza, it may be a risk factor for COVID-19 transmission. Interestingly, a retrospective study found that the BMI of the critical group was markedly higher than that of the general group. Among the non-survivors, 88.2% of patients had a BMI > 25 kg/m², compared with 18.9% of the survivors [13].

Ethanol is often used to eliminate microorganisms, including viruses, bacteria, and fungi. During the current COVID-19 pandemic, claims about the protective effects of alcohol in preventing COVID-19 have appeared on social media [14, 15]. Although the World Health Organization has advised that alcohol consumption does not destroy SARS-CoV-2 or delay disease progression, owing to its immunosuppressive effects [16], a large number of people around the world still believe that drinking alcohol helps prevent COVID-19 [14, 17]. Consequently, a 477% increase in online alcohol sales was reported by the end of April 2020 compared with the same period in previous years [18]. Although several studies have attempted to investigate the relationship between alcohol consumption and COVID-19, the association has not been conclusively established [14, 19]. It is therefore critical to explore whether alcohol drinking is causally related to COVID-19 infection, hospitalization, and severity.

Mendelian randomization (MR) offers an effective method for testing whether the correlation between a specific exposure and an outcome is causal. Given the random assortment of genetic variants during meiosis, confounding factors are expected to be equally distributed among different genotypes. Single nucleotide polymorphisms (SNPs) associated with a given exposure in a genome-wide association study (GWAS) are used as instrumental variables, which should fulfill the following three key assumptions [15, 16]: (a) relevance assumption: significantly related to the exposure; (b) exchangeability assumption: independent of potential confounders; (c) exclusion restriction assumption: associated with the outcome only through the exposure (Fig. 1). Recently, a large-scale meta-analysis comprising up to 1.2 million European individuals was conducted to elucidate the genetic architecture of alcohol consumption, which presents an opportunity for us to apply the MR approach [20].

Fig. 1 Schematic representation of the core assumptions of the instrumental variables in the MR study. SNPs single nucleotide polymorphisms

2 Materials and Methods

2.1 Genetic Instruments for Alcohol Consumption

Summary-level data on alcohol consumption (N = 941,280) were obtained from the GSCAN study [20], the largest recent GWAS of alcohol consumption in the European population. The instrumental variables for the MR analyses were selected based on the following criteria: (a) r² < 0.001 within a 500-Mb window for linkage disequilibrium (LD); (b) P value below the genome-wide significance level (5 × 10⁻⁸ for the alcohol consumption phenotype); (c) minor allele frequency > 0.01; (d) palindromic SNPs (A/T and G/C polymorphisms) were removed. We also used the MR-PRESSO method to identify and eliminate pleiotropic SNPs [21]. SNPs associated with alcohol consumption that were not found in the outcome summary data ("COVID-19") were also excluded.
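
The selection criteria can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the record fields (rsid, ea, oa, eaf, p) and the toy SNP identifiers are hypothetical, and LD clumping (criterion a) is assumed to have been performed beforehand because it requires a reference panel.

```python
# Hypothetical record layout; the field names are illustrative, not GSCAN column names.
PALINDROMIC = {frozenset("AT"), frozenset("CG")}

def is_palindromic(effect_allele, other_allele):
    """A/T and G/C SNPs read the same on both strands, so the strand is ambiguous."""
    return frozenset((effect_allele.upper(), other_allele.upper())) in PALINDROMIC

def select_instruments(snps, outcome_ids, p_max=5e-8, maf_min=0.01):
    """Apply criteria (b)-(d) and the outcome-availability rule.

    LD clumping (criterion a) is assumed to have been done upstream.
    """
    kept = []
    for s in snps:
        if s["p"] >= p_max:                            # (b) genome-wide significance
            continue
        if min(s["eaf"], 1 - s["eaf"]) <= maf_min:     # (c) minor allele frequency
            continue
        if is_palindromic(s["ea"], s["oa"]):           # (d) drop palindromic SNPs
            continue
        if s["rsid"] not in outcome_ids:               # must exist in the outcome data
            continue
        kept.append(s)
    return kept

snps = [
    {"rsid": "rsA", "ea": "A", "oa": "G", "eaf": 0.30, "p": 1e-12},
    {"rsid": "rsB", "ea": "A", "oa": "T", "eaf": 0.40, "p": 1e-20},   # palindromic
    {"rsid": "rsC", "ea": "C", "oa": "T", "eaf": 0.25, "p": 1e-6},    # not significant
    {"rsid": "rsD", "ea": "G", "oa": "A", "eaf": 0.995, "p": 1e-9},   # MAF too low
    {"rsid": "rsE", "ea": "T", "oa": "C", "eaf": 0.10, "p": 1e-10},   # absent from outcome
]
selected = select_instruments(snps, outcome_ids={"rsA", "rsB", "rsC", "rsD"})
print([s["rsid"] for s in selected])
```

In this toy input only one SNP passes every filter; each of the others fails exactly one criterion, which makes the effect of each rule easy to inspect.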

2.2 Study Populations

Summary genetic association estimates for the risk of COVID-19 were derived from the most recent COVID-19 GWAS of UK Biobank released on January 18, 2021 (https://www.covid19hg.org/results/). We compared four phenotypes of the cohort as follows: (1) COVID-19 patients vs. the general population (38,984 patients vs. 1,644,784 control participants); (2) hospitalized COVID-19 patients vs. not hospitalized COVID-19 patients (3159 patients vs. 7206 patients); (3) hospitalized COVID-19 patients vs. the general population (9986 patients vs. 1,877,672 control participants); and (4) severe COVID-19 patients vs. the general population (5101 patients vs. 1,383,241 control participants). COVID-19 and the general population represent individuals who are infected or not infected with SARS-CoV-2, respectively. Hospitalized COVID-19 denotes patients who were infected with SARS-CoV-2 and admitted to the hospital. Severe COVID-19 denotes SARS-CoV-2-infected patients who required non-invasive or invasive ventilation. All COVID-19-related GWAS summary statistics were based on populations of European ancestry.

2.3 Inverse Variance Weighted (IVW)

The principal analyses were conducted using the IVW method, which assumes that all SNPs are valid instrumental variables. IVW first estimates the exposure-outcome effect for each SNP using the Wald ratio and then combines these estimates in a weighted linear regression [22]. IVW methods include the fixed-effects IVW and the random-effects IVW. If heterogeneity was present in the MR analysis, we applied the random-effects IVW, which is less prone to bias from weaker SNP-exposure associations [23].
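
As an illustration of the two IVW variants, the sketch below combines per-SNP Wald ratios with first-order inverse-variance weights. It is a simplified stand-in, not the implementation in the R packages used for the analyses; the function name ivw, the multiplicative random-effects form, and the toy numbers are assumptions made for the example.

```python
import math

def ivw(bx, by, se_y, random_effects=False):
    """Inverse-variance weighted estimate from per-SNP summary statistics.

    bx: SNP-exposure effects; by: SNP-outcome effects; se_y: standard errors of by.
    """
    ratios = [y / x for x, y in zip(bx, by)]              # Wald ratio per SNP
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]    # inverse variance of each ratio
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)                           # fixed-effects standard error
    if random_effects and len(bx) > 1:
        q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
        phi = max(1.0, q / (len(bx) - 1))                 # over-dispersion factor, never below 1
        se *= math.sqrt(phi)
    return beta, se

bx = [0.10, 0.20, 0.15]
by = [0.02, 0.05, 0.01]
se_y = [0.01, 0.02, 0.01]
beta_f, se_f = ivw(bx, by, se_y)
beta_r, se_r = ivw(bx, by, se_y, random_effects=True)
print(beta_f, se_f, se_r)
```

Under this formulation the point estimate is the same for both variants; only the standard error is widened when the SNP-specific estimates disagree more than their sampling error would explain.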

2.4 Genetic Risk Score (GRS)

To validate the IVW results, a secondary analysis was performed using the GRS method. We conducted the analyses in R (version 3.5.3) with the “gtx” R package (version 0.0.8 for Windows), whose grs.summary function implements the GRS analysis. The grs.summary function uses single-SNP summary data obtained from GWAS, which approximates the regression of the outcome on an additive GRS [24]. For uncorrelated SNPs, the causal estimate α can be obtained as \(\alpha \approx \frac{\sum \omega \beta \,{\mathrm{se}}_{\beta }^{-2}}{\sum {\omega }^{2}\,{\mathrm{se}}_{\beta }^{-2}}\), and its standard error as \({\mathrm{se}}_{\alpha }\approx \sqrt{\frac{1}{\sum {\omega }^{2}\,{\mathrm{se}}_{\beta }^{-2}}}\). Here, ω denotes the estimated effect of each SNP on the intermediate trait, and β denotes its estimated effect on the response variable, with standard error se_β [24].
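
The summary-statistic GRS estimate can be written in a few lines of Python. This is a sketch of the formula for uncorrelated SNPs, not a port of the gtx package; the function name grs_summary and the toy inputs are hypothetical, and the standard error is taken as the square root of the inverse of the summed weights so that it is on the same scale as α.

```python
import math

def grs_summary(w, beta, se_beta):
    """Summary-statistic GRS estimate for uncorrelated SNPs.

    w: SNP effects on the intermediate trait (omega in the text);
    beta, se_beta: SNP effects on the response variable and their standard errors.
    """
    denom = sum(wi ** 2 / si ** 2 for wi, si in zip(w, se_beta))
    alpha = sum(wi * bi / si ** 2 for wi, bi, si in zip(w, beta, se_beta)) / denom
    se_alpha = math.sqrt(1.0 / denom)
    return alpha, se_alpha

alpha, se_alpha = grs_summary(
    w=[0.10, 0.20, 0.15],
    beta=[0.02, 0.05, 0.01],
    se_beta=[0.01, 0.02, 0.01],
)
print(alpha, se_alpha)
```

Algebraically, this point estimate coincides with a fixed-effects IVW estimate built from first-order weights, so the two approaches are expected to differ mainly in how the uncertainty is computed.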

2.5 Sensitivity Analysis

2.5.1 Simple Median and Weighted Median

The simple median and weighted median methods are frequently used to estimate causal effects from summary data. These methods are highly tolerant of pleiotropic genetic variants and can yield relatively stable effect estimates even when nearly half of the instrumental variables are invalid. The key distinction between the two methods lies in how the median is computed: the simple median method assigns equal weight to all SNP-specific estimates, whereas the weighted median method assigns a weight to each estimate [25].
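
The difference between the two medians can be sketched as follows. The weighted version interpolates between the ordered SNP-specific estimates at the point where the cumulative normalized weight reaches 50%; the function name and toy values are illustrative, at least two estimates with positive weights are assumed, and in practice the weights would typically be the inverse variances of the ratio estimates.

```python
import statistics

def weighted_median(estimates, weights):
    """Weighted median of SNP-specific causal estimates (needs >= 2 estimates)."""
    pairs = sorted(zip(estimates, weights))
    est = [e for e, _ in pairs]
    wts = [w for _, w in pairs]
    total = sum(wts)
    cum, running = [], 0.0
    for w in wts:
        running += w
        cum.append((running - 0.5 * w) / total)   # mid-point cumulative weight
    below = max(i for i, c in enumerate(cum) if c < 0.5)
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return est[below] + (est[below + 1] - est[below]) * frac

estimates = [0.10, 0.20, 0.90]
simple = statistics.median(estimates)
equal_w = weighted_median(estimates, [1, 1, 1])    # reproduces the simple median here
heavy_w = weighted_median(estimates, [1, 1, 10])   # pulled toward the heavily weighted SNP
print(simple, equal_w, heavy_w)
```

With equal weights and an odd number of SNPs, the weighted median reduces to the simple median, while a precise (heavily weighted) SNP shifts the estimate toward its own value.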

2.5.2 MR-Egger

The MR-Egger method detects and corrects for bias caused by pleiotropy of the instrumental variables by fitting a regression of the SNP-outcome effects on the SNP-exposure effects. If the instrumental variables have no horizontal pleiotropy, the intercept of the model should be zero; thus, the intercept can be used as a statistical test for pleiotropy. The slope of the MR-Egger regression reflects the causal effect of the exposure on the outcome after correction for pleiotropic bias. MR-Egger regression allows genetic variants to be pleiotropic, but it requires that the pleiotropic effects of the genetic variants on the outcome are independent of their effects on the exposure, known as the Instrument Strength Independent of Direct Effect (InSIDE) assumption. When the InSIDE assumption holds, MR-Egger regression can estimate the causal effect, but the precision of the estimate and the statistical power are reduced [26]. An MR-Egger intercept P < 0.05 indicates the presence of directional pleiotropy.
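
A minimal sketch of the MR-Egger point estimates is a weighted least-squares fit with an intercept. The function name mr_egger is hypothetical, the weights are assumed to be the inverse variances of the SNP-outcome effects, and standard errors and P values are omitted because they require a t distribution that the Python standard library does not provide.

```python
def mr_egger(bx, by, se_y):
    """Weighted regression of SNP-outcome effects on SNP-exposure effects.

    Returns (intercept, slope). SNPs are first oriented so that every
    SNP-exposure effect is positive, as is conventional for MR-Egger.
    """
    x = [abs(b) for b in bx]
    y = [o if b >= 0 else -o for b, o in zip(bx, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx                   # pleiotropy-adjusted causal estimate
    intercept = ybar - slope * xbar     # average directional pleiotropy
    return intercept, slope

# Toy data built as by = 0.01 + 0.5 * bx after orientation (last SNP has a negative bx).
bx = [0.1, 0.2, 0.3, -0.4]
by = [0.06, 0.11, 0.16, -0.21]
se_y = [0.01, 0.02, 0.01, 0.03]
intercept, slope = mr_egger(bx, by, se_y)
print(intercept, slope)
```

In the toy data every SNP carries the same direct effect on the outcome, so the fit recovers that shared pleiotropic effect as the intercept and the underlying causal effect as the slope.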

2.5.3 MR Pleiotropy Residual Sum and Outlier (MR-PRESSO)

MR-PRESSO identifies horizontal pleiotropy by detecting outliers among the genetic variants that contribute to the MR estimate. This method assumes that at least 50% of the genetic variants are valid instruments with no horizontal pleiotropy and that the InSIDE assumption holds. In addition to identifying outlier genetic variants, the MR-PRESSO method can also provide an adjusted estimate after removal of the outlier variants [21]. In summary, the MR-PRESSO approach has the following three primary purposes [16]: (1) the “MR-PRESSO global test” to identify the extent of horizontal pleiotropy; (2) the “MR-PRESSO outlier test” to exclude aberrant genetic variants (outliers) and estimate the corrected results; (3) the “MR-PRESSO distortion test” to assess whether the estimates differ before and after correction.

2.5.4 MR-Robust Adjusted Profile Score (MR-RAPS)

The MR-Robust adjusted profile score (MR-RAPS) with a Huber loss function can model the random-effects distribution of pleiotropic effects. By taking into account both systematic and idiosyncratic pleiotropy, the MR-RAPS method has shown outstanding performance in numerical simulations. It is highly recommended as a practical tool for routine MR analysis, especially when the exposure and the outcome are both complex traits [27].

2.5.5 Cochran’s Q Test

Cochran’s Q statistic is commonly used in meta-analysis to evaluate heterogeneity among studies. Fabiola et al. proposed its application to MR analysis in 2015. A P value of the Q statistic greater than 0.05 indicates that there is no significant heterogeneity among the included SNPs, so the influence of heterogeneity on the results need not be addressed [28].
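
As a sketch of how the Q statistic and its P value can be obtained from summary data, the Python example below computes Q around the IVW estimate and evaluates the chi-square upper tail with K − 1 degrees of freedom. The function names and toy inputs are hypothetical, and the chi-square tail is evaluated with a standard recurrence for the incomplete gamma function rather than an external statistics library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, sf = 1.0, math.exp(-half)               # closed form for df = 2
    else:
        a, sf = 0.5, math.erfc(math.sqrt(half))    # closed form for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        sf += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return min(sf, 1.0)

def cochran_q(bx, by, se_y):
    """Cochran's Q of the SNP-specific Wald ratios around the IVW estimate."""
    ratios = [y / x for x, y in zip(bx, by)]
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]
    beta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    return q, df, chi2_sf(q, df)

q, df, p = cochran_q([0.10, 0.20, 0.15], [0.02, 0.05, 0.01], [0.01, 0.02, 0.01])
print(q, df, p)
```

A P value above 0.05 would support the fixed-effects IVW model under the rule described above, whereas a smaller value would point to the random-effects model.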

2.5.6 F Statistics

Weak instrumental variable bias is a considerable problem in MR studies. Weak instrumental variables are those with limited explanatory power for the exposure. Insufficient sample size is generally the primary cause of weak instrumental variable bias. The F statistic is therefore used to evaluate the strength of the instrumental variables, and it is calculated as follows:

$$F = \frac{N - K - 1}{K} \times \frac{R^{2}}{1 - R^{2}}, \quad R^{2} = 2 \times \left( 1 - \mathrm{MAF} \right) \times \mathrm{MAF} \times \frac{\beta^{2}}{\mathrm{SD}^{2}}, \quad \mathrm{SD} = \mathrm{SE} \times \sqrt{N}.$$

Here, N represents the sample size of the exposure GWAS, K represents the number of instrumental variables, and R² represents the proportion of variance in the exposure explained by the instrumental variables (the coefficient of determination of the regression equation). MAF is the minor allele frequency, β is the effect size of the SNP on the exposure, SE is the standard error of β, and SD is the standard deviation. An F statistic below 10 is typically indicative of weak instrumental variables, which may introduce bias into the results. In that case, the results should be interpreted with caution.
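
The formulas above translate directly into code. In the sketch below, the function names and input values are hypothetical, and obtaining the overall R² for several instruments by summing the per-SNP values is an assumption of this example rather than something stated in the text.

```python
import math

def snp_r2(beta, se, maf, n):
    """Variance in the exposure explained by one SNP, using SD = SE * sqrt(N)."""
    sd = se * math.sqrt(n)
    return 2.0 * (1.0 - maf) * maf * beta ** 2 / sd ** 2

def f_statistic(r2, n, k):
    """F = (N - K - 1) / K * R^2 / (1 - R^2)."""
    return (n - k - 1) / k * r2 / (1.0 - r2)

# Hypothetical instruments: (beta, SE, MAF); N is the exposure GWAS sample size.
n = 941280
instruments = [(0.020, 0.003, 0.30), (0.015, 0.002, 0.45), (0.030, 0.005, 0.12)]
r2_total = sum(snp_r2(b, s, m, n) for b, s, m in instruments)   # assumed additive
f_all = f_statistic(r2_total, n, k=len(instruments))
print(f_all, "weak" if f_all < 10 else "adequate")
```

The same functions can be applied to a single SNP by passing k=1, which is a common way to screen instruments individually against the threshold of 10.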

All statistical tests were two-sided, and statistical significance was set at P < 0.05. R version 4.1.2 and the packages “MendelianRandomization”, “TwoSampleMR”, “mr.raps” and “MRPRESSO” were used for all analyses [29].

2.6 Sample Overlap

If the exposure and outcome datasets partially overlap, the results may be biased and the Type 1 error rate inflated [30]. In our study, the exposure and outcome data were taken from two non-overlapping datasets (alcohol consumption GWAS and COVID-19 GWAS), which potentially avoided inflating bias and producing spurious causal estimates.

3 Results

3.1 Instrumental Variables Selection

A total of 99 SNPs in the GWAS reached genome-wide significance (P < 5 × 10⁻⁸). Eighty-two independent instrumental variables without LD were retained according to the selection criteria. However, two of these SNPs were not available in the outcome (COVID-19) data. Therefore, 80 valid instrumental variables for alcohol consumption were ultimately used in the following MR analysis. The genetic association data for alcohol consumption are summarized in Supplementary Table 1.

3.2 IVW

As shown in Table 1, the MR analysis showed no statistically significant relationship between alcohol consumption and COVID-19 (IVW: beta = 0.0372; 95% CI − 0.1817 to 0.2561; P = 0.74), hospitalized COVID-19 compared with not hospitalized COVID-19 (IVW: beta = − 0.3625; 95% CI − 1.4151 to 0.6900; P = 0.50), hospitalized COVID-19 compared with the general population (IVW: beta = − 0.1203; 95% CI − 0.5997 to 0.3591; P = 0.62), and severe COVID-19 compared with the general population (IVW: beta = 0.2963; 95% CI − 0.3682 to 0.9607; P = 0.38) (Figs. 2, 3a–d).
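
As a plausibility check on the reported estimates, the standard error and two-sided P value can be recovered from each beta and its 95% CI. The sketch below assumes that the intervals are symmetric normal-based (Wald) intervals, which is not stated in the text; the helper names are hypothetical and the numbers are those reported for the IVW analyses.

```python
import math

Z_975 = 1.959963984540054   # 97.5th percentile of the standard normal

def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% Wald interval."""
    return (upper - lower) / (2.0 * Z_975)

def two_sided_p(beta, se):
    """Two-sided normal P value for the ratio beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))

# (beta, CI lower, CI upper) as reported for the IVW analyses.
ivw_rows = {
    "COVID-19 vs. general population": (0.0372, -0.1817, 0.2561),
    "hospitalized vs. not hospitalized": (-0.3625, -1.4151, 0.6900),
    "hospitalized vs. general population": (-0.1203, -0.5997, 0.3591),
    "severe vs. general population": (0.2963, -0.3682, 0.9607),
}
p_values = {}
for label, (beta, lo, hi) in ivw_rows.items():
    se = se_from_ci(lo, hi)
    p_values[label] = two_sided_p(beta, se)
    print(f"{label}: SE = {se:.4f}, P = {p_values[label]:.2f}")
```

Under the stated assumption, the recovered P values can be compared directly with those reported in the text, which provides a quick internal consistency check on the tabulated estimates.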

Table 1 Causal association of alcohol consumption with COVID-19 via MR analyses

Fig. 2 Risk of COVID-19 vs. general population, hospitalized COVID-19 vs. not hospitalized COVID-19, hospitalized COVID-19 vs. general population, severe COVID-19 vs. general population for genetically predicted alcohol consumption. COVID-19 coronavirus disease 2019, IVW inverse-variance weighted, MR-PRESSO Mendelian Randomization Pleiotropy RESidual Sum and Outlier, MR-RAPS Mendelian Randomization robust adjusted profile score

Fig. 3 Scatter plot showing the associations of the SNP effects on alcohol consumption against the SNP effects on COVID-19. a The associations of the SNP effects on alcohol consumption against the SNP effects on COVID-19 vs. general population; b the associations of the SNP effects on alcohol consumption against the SNP effects on hospitalized COVID-19 vs. not hospitalized COVID-19; c the associations of the SNP effects on alcohol consumption against the SNP effects on hospitalized COVID-19 vs. general population; d the associations of the SNP effects on alcohol consumption against the SNP effects on severe COVID-19 vs. general population. COVID-19 coronavirus disease 2019, MR Mendelian randomization, SNP single nucleotide polymorphism

3.3 GRS

To validate the results of the IVW approach, we further conducted a secondary MR analysis using the GRS protocol to investigate the relationship between alcohol consumption and COVID-19. Consistent with the IVW results, the GRS revealed no causal relationship between alcohol consumption and COVID-19 (beta = 0.0372; 95% CI − 0.1737 to 0.2481; P = 0.73), hospitalized COVID-19 compared with not hospitalized COVID-19 (beta = − 0.3625; 95% CI − 1.3633 to 0.6383; P = 0.48), hospitalized COVID-19 compared with the general population (beta = − 0.1203; 95% CI − 0.5352 to 0.2946; P = 0.57), and severe COVID-19 compared with the general population (beta = 0.2963; 95% CI − 0.3240 to 0.9166; P = 0.35) (Table 2; Fig. 4a–d).

Table 2 The effect of the GRS instrument of alcohol consumption on COVID-19

Fig. 4 Genetic risk score GRSalcohol for COVID-19. a Genetic risk score GRSalcohol for COVID-19 vs. general population; b genetic risk score GRSalcohol for hospitalized COVID-19 vs. not hospitalized COVID-19; c genetic risk score GRSalcohol for hospitalized COVID-19 vs. general population; d genetic risk score GRSalcohol for severe COVID-19 vs. general population. The estimated effects on COVID-19 risk (vertical axis) are plotted against the estimated effects on alcohol consumption (horizontal axis). COVID-19 coronavirus disease 2019

3.4 Sensitivity Analysis

3.4.1 Weighted Median, Simple Median, MR-PRESSO and MR-RAPS

Four sensitivity analysis methods were used in the study: weighted median, simple median, MR-PRESSO and MR-RAPS. The sensitivity analyses yielded similar results. There was no statistically significant evidence of a relationship between alcohol consumption and COVID-19 (weighted median: P = 0.67, simple median: P = 0.22, MR-PRESSO: P = 0.74 and MR-RAPS: P = 0.73, Table 1; Figs. 2, 3a–d), hospitalized COVID-19 compared with not hospitalized COVID-19 (weighted median: P = 0.28, simple median: P = 0.59, MR-PRESSO: P = 0.50 and MR-RAPS: P = 0.48, Table 1; Figs. 2, 3a–d), hospitalized COVID-19 compared with the general population (weighted median: P = 0.69, simple median: P = 0.56, MR-PRESSO: P = 0.62 and MR-RAPS: P = 0.57, Table 1; Figs. 2, 3a–d), and severe COVID-19 compared with the general population (weighted median: P = 0.37, simple median: P = 0.66, MR-PRESSO: P = 0.38 and MR-RAPS: P = 0.35, Table 1; Figs. 2, 3a–d). These sensitivity analyses support the robustness of the results from the main analytical approaches (IVW and GRS).

3.4.2 Horizontal Pleiotropy and Heterogeneity

Based on Cochran’s Q test and the MR-Egger regression test, there was no directional pleiotropy or heterogeneity for the associations between alcohol consumption and COVID-19 (MR-Egger intercept P = 0.24, Cochran’s Q P = 0.30, Table 3), hospitalized COVID-19 compared with not hospitalized COVID-19 (MR-Egger intercept P = 0.56, Cochran’s Q P = 0.24, Table 3), and severe COVID-19 compared with the general population (MR-Egger intercept P = 0.98, Cochran’s Q P = 0.17, Table 3). However, there was slight heterogeneity, but no horizontal pleiotropy, for hospitalized COVID-19 compared with the general population (MR-Egger intercept P = 0.98, Cochran’s Q P = 0.02, Table 3), suggesting that the fixed-effects IVW might give an inaccurate estimate for this association. We therefore used the random-effects IVW model instead of the fixed-effects IVW model for this analysis (Table 1).

Table 3 Potential pleiotropy evaluation using different methods

4 Discussion

Considering the advantages of the MR method, we conducted a two-sample MR study using data from the prospective UK Biobank cohort. In the present study, the IVW analysis provided no evidence of a causal relationship between alcohol consumption and the risk of COVID-19, hospitalized COVID-19, or severe COVID-19. Furthermore, the GRS analysis also found no relationship between alcohol drinking and the risk of COVID-19, hospitalized COVID-19, or severe COVID-19. Moreover, multiple sensitivity analyses gave results consistent with the IVW and GRS methods, suggesting that our results are reliable in the European population.

Among modifiable environmental risk factors, observational studies of the relationship between alcohol use and COVID-19 have yielded mixed results [31]. Lassen et al. found that weekly alcohol consumption was associated with an increased risk of developing acute respiratory distress syndrome among hospitalized COVID-19 patients. However, Dai et al. did not detect a similar effect on the severity and mortality of COVID-19 [32]. More interestingly, a recent study identified obesity (BMI ≥ 34) as a major factor that increased SARS-CoV-2 mortality in the context of alcohol use, indicating superimposed effects of obesity and alcohol use. The proposed mechanistic hypothesis is that the combination of alcohol consumption and obesity lowers immunity and makes individuals prone to developing cytokine storm and acute respiratory distress syndrome, the hallmarks of COVID-19 mortality. Therefore, confounding may be difficult to avoid in typical epidemiological investigations, which can lead to misinterpretation or reverse causation.

MR analysis is a useful approach that can reduce the influence of confounding factors and help determine the true relationship between exposure and outcome [22]. In 2021, a genetic score of three variants, rs1229984 (Alcohol Dehydrogenase 1B, ADH1B), rs1260326 (Glucokinase Regulator, GCKR), and rs13107325 (Solute Carrier Family 39 Member 8, SLC39A8), was constructed as the instrumental variable to assess the effect of alcohol consumption on clinical outcomes in patients with COVID-19. Consistent with the previous observational study, that MR analysis found that frequent drinking, especially heavy drinking, was associated with a higher risk of death in COVID-19 patients with obesity, but not in patients without obesity, suggesting that obesity may modify the relationship between alcohol consumption and severe COVID-19. Later, Rao et al. employed a two-sample MR analysis to evaluate the possible associations between drinking behavior and three COVID-19 outcomes: COVID-19 infection, hospitalized COVID-19 and severe COVID-19. No significant links were found between alcohol consumption status and COVID-19 outcomes. In line with the above reports, our findings confirm and extend the evidence that alcohol consumption does not affect susceptibility to SARS-CoV-2 infection, regardless of the outcome examined.

Alcohol is currently consumed by around 40% of the world's population, which poses a clear long-term health problem. Existing evidence indicates that alcohol can be directly absorbed through the bronchial circulation and vaporized toward the trachea. The resulting high local concentration may eventually attenuate the host defenses of the airway epithelium [33]. Not surprisingly, chronic alcohol use has been associated with significant alterations of the immune system that predispose to viral and bacterial infections through alveolar epithelial dysfunction, oxidative stress, and abnormalities in macrophages [34]. Given this dysregulation of immune host defense in the airways, acute and chronic alcohol abuse would be expected to contribute to vulnerability to SARS-CoV-2 infection. However, no causal effect of alcohol on COVID-19 susceptibility was observed in the current study. This may reflect that the virus is so highly infectious that any potential effect of alcohol is negligible by comparison.

The strengths of our study include the large sample size, with 38,984 COVID-19 cases, and the use of 80 independent alcohol-associated genetic variants as instrumental variables, both of which increase the statistical power of our study. In addition, the estimates from the IVW and GRS methods in the different COVID-19 groups were consistent, indicating the robustness of the findings. Furthermore, the results of the various sensitivity analyses suggest that pleiotropy was unlikely to have biased the associations. Some limitations also need to be considered. First, the COVID-19 data were obtained from the UK Biobank, which may not accurately reflect the entire population. Second, our study only includes individuals of European ancestry, so further studies are warranted in populations of other ancestries.

In conclusion, our study indicates that alcohol use is not causally associated with the risk of COVID-19, irrespective of hospitalization or disease severity. These analyses strengthen and expand the understanding of COVID-19 pathophysiology and may inform clinical practice in COVID-19 prevention.