Key Summary Points

Why carry out this study?

A large number of observational studies have described the links between tobacco smoking, alcohol consumption and the risk of infectious diseases. However, it is difficult to determine their causalities because these observational results are inevitably affected by potential confounding effects and reverse causation.

Mendelian randomization (MR) is a useful tool that could provide unconfounded effect estimates and overcome the limitations of observational studies by utilizing genetic variants as instrumental variables.

The aim of this study was to employ MR methods to determine the potential causal effects of smoking and alcohol use on the risk of common infectious diseases, including sepsis, pneumonia, upper respiratory tract infection (URTI) and urinary tract infection (UTI).

What was learned from the study?

To the best of our knowledge, this is the first large-scale MR analysis to investigate the causal effects between smoking, alcohol use and risk of infectious diseases.

Genetically predicted smoking initiation, cigarettes per day and lifetime smoking were associated with an increased risk of sepsis and pneumonia. However, no evidence was found to support a causal effect of genetically predicted alcohol use on the risk of infectious diseases.

Our findings provide a better understanding of the role of smoking and alcohol use in infectious diseases, and indicate that tobacco smoking may be an independent risk factor for infections.

Introduction

Infections are a major cause of morbidity and mortality worldwide, affecting approximately one-fourteenth of the global population between 2009 and 2013, and thus contributing to the global disease burden [1]. In their most serious presentations, infections can progress rapidly into sepsis, multi-organ failure and even death [2]. Therefore, it is necessary to identify potential risk factors for infections and thus improve global public health.

Tobacco and alcohol consumption are among the most important public health concerns. A recent survey concluded that approximately 18.4% of the adult population had at least one occasion of heavy episodic alcohol use in the past month, and nearly one in seven adults engaged in daily tobacco smoking [3]. A large number of observational studies have suggested that tobacco smoking and alcohol use are associated with an increased risk of a variety of infectious diseases, including sepsis, septic shock and acute respiratory distress syndrome (ARDS) [4,5,6,7,8,9]. However, the causal direction between smoking and alcohol use and the risk of infections remains uncertain because these observational results are inevitably affected by other potential confounding effects and reverse causation.

Mendelian randomization (MR) is a popular tool that can provide unconfounded effect estimates and overcome the limitations of observational studies by utilizing genetic variants as instrumental variables. Given that the rationale behind the design of MR is that the genetic variants are assigned randomly at conception, the result of MR is less likely to be affected by confounding or reverse causation than conventional observational studies. Additionally, a recent study has successfully employed the MR method to investigate the causal effects between lifetime smoking and schizophrenia and depression [10]. To our knowledge, up to now, there has been a lack of evidence to support the causal effects between smoking, alcohol use and risk of infectious diseases.

In the current study, we aimed to employ MR methods to determine the potential causal effects of smoking and alcohol use on the risk of four common infectious diseases: sepsis, pneumonia, upper respiratory tract infection (URTI) and urinary tract infection (UTI). We hypothesized that there would be a directional causal effect between tobacco smoking, alcohol use and risk of infectious diseases.

Methods

Study Design

We conducted MR analysis based on publicly available summary-level data from genome-wide association studies (GWASs) to evaluate the causal relationships between smoking, alcohol use and risk of infectious diseases. In order to perform MR analysis, the following critical assumptions were made in this study: (1) instrumental variables were strongly associated with exposure; (2) instrumental variables were independent of confounders of exposure and outcome; and (3) instrumental variables affected outcome only via exposure [11].

All original studies obtained ethical approval and informed consent from the participants. The data for this study were anonymous and available in the public domain, and therefore the requirement for ethical approval and informed consent in this study was waived. This study was conducted in accordance with the ethical principles of the 2013 Declaration of Helsinki [12]. Meanwhile, the results of this study were reported in adherence to the Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization (STROBE-MR) guidance from 2021 [13].

GWAS Summary Data for Tobacco and Alcohol Use

Genetic association estimates of single nucleotide polymorphisms (SNPs) with smoking and alcohol use were obtained from the GWAS and Sequencing Consortium of Alcohol and Nicotine Use (GSCAN) and the UK Biobank [10, 14]. Four smoking phenotypes and one alcohol-related phenotype were included in our study: smoking initiation [including the age of initiation of regular smoking (AgeSmk, N = 341,427) and a binary phenotype indicating whether an individual had ever smoked regularly (SmkInit, N = 1,232,091)], cigarettes per day (CigDay, N = 337,334), lifetime smoking (LifSmk, N = 462,690) and drinks per week (DrnkWk, N = 941,280). The definition of each phenotype is listed in the Supplementary Note.

GWAS Summary Data for Infectious Diseases

Genetic determinants of sepsis [ieu-b-4980, N = 486,484 (including 11,643 cases and 474,841 control subjects)], pneumonia [ieu-b-4976, N = 486,484 (including 22,567 cases and 463,917 control subjects)], URTI [ieu-b-5063, N = 486,484 (including 2795 cases and 483,689 control subjects)] and UTI [ieu-b-5065, N = 486,214 (including 21,958 cases and 464,256 control subjects)] were obtained from summary-level GWAS results in the UK Biobank, publicly available in the Integrative Epidemiology Unit (IEU) GWAS database (https://gwas.mrcieu.ac.uk/). In brief, the UK Biobank is a large, prospective cohort study with more than 500,000 participants from the United Kingdom [15]. Table 1 summarizes the data sources, years, population, gender and sample size in these GWASs.

Table 1 Characteristics of the data used in this study

Selection of Genetic Instruments

Following the procedure of Chen et al. [16], we selected eligible genetic instrumental variables as follows: (1) the SNPs associated with each exposure (AgeSmk, SmkInit, CigDay, LifSmk and DrnkWk) reached the genome-wide significance threshold (P < 5 × 10⁻⁸); (2) to avoid linkage disequilibrium, we performed the clumping procedure with R² < 0.001 and a clumping window of > 10,000 kb; (3) we excluded SNPs that were significantly associated with the outcome (P < 5 × 10⁻⁸); and (4) we included SNPs with F statistics > 10, which indicated that the genetic variants had relatively strong estimated effects. The F statistic for each SNP was calculated by the following equation: F = R² × (N − 2)/(1 − R²), where R² is the proportion of variance in the exposure explained by the SNP and N is the sample size. The main information for the SNPs, including effect allele, other allele, beta, standard error and P value, was collected systematically for further analysis.
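The P-value and F-statistic filters described above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in the study: the record fields (`rsid`, `r2`, `p_exposure`, `p_outcome`), the function names and the SNP values are all hypothetical, the per-SNP R² is assumed to be supplied already, and the clumping step is omitted because it requires a linkage disequilibrium reference panel.

```python
def f_statistic(r2: float, n: int) -> float:
    """Per-SNP F statistic, F = R^2 * (N - 2) / (1 - R^2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 * (n - 2) / (1.0 - r2)


def select_instruments(snps, n, p_threshold=5e-8, f_threshold=10.0):
    """Keep SNPs that are genome-wide significant for the exposure,
    not genome-wide significant for the outcome, and have F > f_threshold."""
    kept = []
    for snp in snps:
        f = f_statistic(snp["r2"], n)
        if (
            snp["p_exposure"] < p_threshold
            and snp["p_outcome"] >= p_threshold
            and f > f_threshold
        ):
            kept.append({**snp, "f": f})
    return kept


# Hypothetical SNP records; the values are illustrative, not study data.
example_snps = [
    {"rsid": "rs_example_1", "r2": 0.0002, "p_exposure": 1e-12, "p_outcome": 0.40},
    {"rsid": "rs_example_2", "r2": 0.00001, "p_exposure": 1e-9, "p_outcome": 0.10},
    {"rsid": "rs_example_3", "r2": 0.0003, "p_exposure": 1e-6, "p_outcome": 0.20},
]

# N is taken from the CigDay GWAS sample size reported in the Methods.
selected = select_instruments(example_snps, n=337_334)
for snp in selected:
    print(snp["rsid"], round(snp["f"], 2))
```

In this toy input, the second SNP fails the F-statistic filter and the third fails the exposure significance threshold, so only the first is retained.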

Mendelian Randomization Analysis

In the present study, we employed five complementary approaches to estimate the causal effects of tobacco smoking and alcohol use on the risk of infectious diseases, including the random-effects inverse variance weighted (IVW), MR-Egger regression, weighted median, simple mode and weighted mode methods. Among the five methods, IVW was used as the main analysis to evaluate the causal associations between exposures and outcomes, because it is the most widely used of the methods in MR studies and could provide the most precise results when all selected SNPs were valid instrumental variables. The other four approaches were utilized as additional methods for MR analysis.
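To make the main analysis concrete, the sketch below shows one common formulation of the random-effects IVW estimator: a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, with the standard error inflated when Cochran's Q exceeds its degrees of freedom. It is an illustration under stated assumptions, not the code of the R packages used in the study; the function name and the input values are hypothetical.

```python
import math


def ivw_random_effects(beta_x, beta_y, se_y):
    """Return (estimate, standard error) from a multiplicative
    random-effects inverse variance weighted (IVW) model.

    beta_x: SNP-exposure effects, beta_y: SNP-outcome effects,
    se_y: standard errors of the SNP-outcome effects.
    """
    if not (len(beta_x) == len(beta_y) == len(se_y)) or len(beta_x) < 2:
        raise ValueError("need at least two SNPs with matching inputs")
    weights = [1.0 / s ** 2 for s in se_y]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_x))
    estimate = num / den
    # Cochran's Q measures the spread of SNP-specific effects around the estimate.
    q = sum(w * (by - estimate * bx) ** 2
            for w, bx, by in zip(weights, beta_x, beta_y))
    # The standard error is scaled up under over-dispersion and never scaled down.
    scale = max(1.0, math.sqrt(q / (len(beta_x) - 1)))
    return estimate, scale * math.sqrt(1.0 / den)


# Hypothetical summary statistics in which every SNP implies the same effect (0.5).
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.05, 0.10, 0.075]
se_y = [0.02, 0.02, 0.02]

estimate, se = ivw_random_effects(beta_x, beta_y, se_y)
odds_ratio = math.exp(estimate)  # valid when beta_y is on the log-odds scale
print(estimate, se, odds_ratio)
```

When all instruments are valid and homogeneous, as in this toy input, the scaling factor is 1 and the result coincides with the fixed-effect IVW estimate.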

Education and obesity were recently identified as major confounding factors of the smoking–infections relationship [17, 18]. Consequently, we performed multivariable MR analysis using the multivariable IVW method to adjust for education and body mass index (BMI) when assessing the independent causal effects of tobacco smoking and alcohol use on the risk of infectious diseases.
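The idea behind the multivariable IVW method can be illustrated as a weighted regression of SNP-outcome effects on several SNP-exposure effects at once, without an intercept, so that each coefficient is the effect of one exposure conditional on the others. The sketch below is a simplified point-estimate version under that assumption; the function names, the column order (smoking, education, BMI) and all numeric values are hypothetical, and standard errors are not computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def multivariable_ivw(beta_exposures, beta_y, se_y):
    """Weighted least squares through the origin.

    beta_exposures: one row per SNP, one column per exposure.
    Returns one coefficient per exposure.
    """
    k = len(beta_exposures[0])
    w = [1.0 / s ** 2 for s in se_y]
    xtwx = [[sum(wi * row[a] * row[b] for wi, row in zip(w, beta_exposures))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * row[a] * by for wi, row, by in zip(w, beta_exposures, beta_y))
            for a in range(k)]
    return solve_linear_system(xtwx, xtwy)


# Hypothetical SNP effects on (smoking, education, BMI); not study data.
rows = [
    [0.10, 0.02, 0.01],
    [0.05, 0.08, 0.00],
    [0.12, 0.01, 0.06],
    [0.03, 0.04, 0.09],
    [0.07, 0.00, 0.03],
]
# Outcome effects constructed so the true coefficients are 0.4, 0.0 and -0.2.
beta_y = [0.4 * r[0] + 0.0 * r[1] - 0.2 * r[2] for r in rows]
se_y = [0.02] * len(rows)

coefficients = multivariable_ivw(rows, beta_y, se_y)
print(coefficients)
```

Because the toy outcome effects are generated without noise, the regression recovers the constructed coefficients up to floating-point error.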

Sensitivity Analyses

In this study, sensitivity analyses comprised tests for heterogeneity and genetic pleiotropy, leave-one-out analysis and a funnel plot. First, to estimate the heterogeneity of the IVW approach, we calculated the Cochran’s Q statistic; the P value of Cochran’s Q test was used to test for heterogeneity (a P value of less than 0.05 indicated heterogeneity). Second, to estimate the genetic pleiotropy, we employed the intercept from MR-Egger regression to examine the horizontal pleiotropy (a P value of less than 0.05 indicated horizontal pleiotropy). Third, we conducted leave-one-out analysis by removing each SNP and testing the remaining SNPs; this could be used to detect outliers. Fourth, we applied a funnel plot to conduct a visual inspection for asymmetry, which may be an indication of violations of the MR assumption via horizontal pleiotropy.
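Three of these checks can be sketched directly from summary statistics. The example below is a simplified illustration with hypothetical function names and made-up inputs: it computes Cochran's Q, the MR-Egger intercept as the intercept of a weighted regression, and leave-one-out IVW estimates. It assumes the SNP-exposure effects are already oriented to be positive, and it omits the P values, which would require chi-square and t distributions.

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW slope (weighted regression through the origin)."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * bx * by for wi, bx, by in zip(w, beta_x, beta_y))
    den = sum(wi * bx * bx for wi, bx in zip(w, beta_x))
    return num / den


def cochran_q(beta_x, beta_y, se_y):
    """Weighted sum of squared deviations of SNP effects from the IVW fit."""
    est = ivw_estimate(beta_x, beta_y, se_y)
    return sum(((by - est * bx) / s) ** 2
               for bx, by, s in zip(beta_x, beta_y, se_y))


def egger_intercept(beta_x, beta_y, se_y):
    """Intercept of the weighted regression of beta_y on beta_x.

    An intercept far from zero suggests directional horizontal pleiotropy.
    """
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    mean_x = sum(wi * bx for wi, bx in zip(w, beta_x)) / sw
    mean_y = sum(wi * by for wi, by in zip(w, beta_y)) / sw
    sxx = sum(wi * (bx - mean_x) ** 2 for wi, bx in zip(w, beta_x))
    sxy = sum(wi * (bx - mean_x) * (by - mean_y)
              for wi, bx, by in zip(w, beta_x, beta_y))
    return mean_y - (sxy / sxx) * mean_x


def leave_one_out(beta_x, beta_y, se_y):
    """IVW estimate recomputed with each SNP removed in turn."""
    n = len(beta_x)
    estimates = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        estimates.append(ivw_estimate([beta_x[j] for j in keep],
                                      [beta_y[j] for j in keep],
                                      [se_y[j] for j in keep]))
    return estimates


# Hypothetical data: a true slope of 0.5 plus a constant pleiotropic shift of 0.01.
beta_x = [0.10, 0.20, 0.15, 0.30]
beta_y = [0.06, 0.11, 0.085, 0.16]
se_y = [0.02, 0.02, 0.02, 0.02]

print(cochran_q(beta_x, beta_y, se_y))
print(egger_intercept(beta_x, beta_y, se_y))
print(leave_one_out(beta_x, beta_y, se_y))
```

In this toy input the constant shift shows up as a non-zero MR-Egger intercept, and the leave-one-out estimates move as individual SNPs are dropped.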

Statistical Analysis

Causal estimates were displayed as an odds ratio (OR) and 95% confidence interval (CI). We used the Bonferroni method to correct for multiple testing, and a P value of less than 0.0025 (α = 0.05/20 tests, i.e. five exposures × four outcomes) was considered statistically significant, whilst a P value of less than 0.05 was regarded as nominally significant. The scatter plots, leave-one-out plots, funnel plots and all statistical analyses in this study were produced using the “TwoSampleMR” package (https://mrcieu.github.io/TwoSampleMR/) and the “MendelianRandomization” package (https://cran.r-project.org/package=MendelianRandomization) in R (version 3.6.1; R Project for Statistical Computing, Vienna, Austria).
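The reporting conventions above amount to a small amount of arithmetic, sketched here under stated assumptions. The function names are hypothetical, the 1.96 multiplier and the normal-approximation P value are assumptions about how a log-odds estimate would typically be converted, and the P values passed to `classify` are those reported for SmkInit and CigDay in the univariable results.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def two_sided_p(beta, se):
    """Two-sided P value under a normal approximation (an assumption here)."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def classify(p, n_tests=20, alpha=0.05):
    """Apply the Bonferroni threshold alpha / n_tests, then the nominal threshold."""
    if p < alpha / n_tests:
        return "significant"
    if p < alpha:
        return "nominally significant"
    return "not significant"


print(0.05 / 20)          # Bonferroni threshold for 20 tests
print(classify(0.009))    # SmkInit and sepsis
print(classify(0.00156))  # CigDay and pneumonia
print(classify(0.123))    # SmkInit and URTI
```

Under these thresholds, a result such as P = 0.009 counts as nominally significant only, whereas P = 0.00156 passes the Bonferroni correction.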

Results

Characteristics of the Genetic Instruments

Following the instrument selection steps, nine index SNPs were selected to genetically predict AgeSmk, 154 index SNPs were used to genetically predict SmkInit, 32 index SNPs were chosen to genetically predict CigDay, 108 index SNPs were selected to genetically predict LifSmk, and 72 index SNPs were chosen to genetically predict DrnkWk. The F statistics for these genetic instruments were all over 10, suggesting that no weak instruments were employed. An overview of the instrumental variables included in each MR analysis is provided in Supplementary Tables S1–S5.

Univariable Mendelian Randomization Analysis

The MR analysis estimates from different methods for the causal effects of tobacco smoking and alcohol use on risk of infectious diseases are presented in Figs. 1–5. More specifically, genetically predicted SmkInit was associated with a higher risk of sepsis (OR 1.353, 95% CI 1.079–1.696, P = 0.009), pneumonia (OR 1.770, 95% CI 1.464–2.141, P = 3.8 × 10⁻⁹) and UTI (OR 1.445, 95% CI 1.184–1.764, P = 3 × 10⁻⁴), but not with a higher risk of URTI (OR 1.544, 95% CI 0.889–2.683, P = 0.123). In addition, genetically predicted CigDay was associated with a higher risk of sepsis (OR 1.403, 95% CI 1.037–1.898, P = 0.028) and pneumonia (OR 1.501, 95% CI 1.167–1.930, P = 0.00156), but not with a higher risk of URTI (OR 0.774, 95% CI 0.418–1.432, P = 0.414) or UTI (OR 1.207, 95% CI 0.901–1.617, P = 0.207). Moreover, genetically predicted LifSmk was associated with an increased risk of sepsis (OR 2.200, 95% CI 1.583–3.057, P = 2.63 × 10⁻⁶), pneumonia (OR 3.462, 95% CI 2.798–4.285, P = 3.28 × 10⁻³⁰), URTI (OR 2.523, 95% CI 1.315–4.841, P = 0.005) and UTI (OR 2.036, 95% CI 1.585–2.616, P = 3.0 × 10⁻⁸). Furthermore, no evidence was found to support causal effects of AgeSmk or DrnkWk on the risk of sepsis (OR 0.700, 95% CI 0.381–1.284, P = 0.249; OR 0.994, 95% CI 0.752–1.315, P = 0.969; respectively), pneumonia (OR 0.632, 95% CI 0.388–1.063, P = 0.086; OR 0.961, 95% CI 0.760–1.216, P = 0.739; respectively), URTI (OR 0.894, 95% CI 0.261–3.068, P = 0.859; OR 0.940, 95% CI 0.541–1.633, P = 0.827; respectively) or UTI (OR 0.665, 95% CI 0.373–1.186, P = 0.167; OR 1.011, 95% CI 0.802–1.274, P = 0.926; respectively). The scatter plots for univariable MR analyses are presented in Supplementary Figs. S1–S5.

Fig. 1 Univariable MR estimates for the causal effects of AgeSmk on the risk of infectious diseases. MR Mendelian randomization, AgeSmk age of initiation of regular smoking, URTI upper respiratory tract infections, UTI urinary tract infections, SNP single nucleotide polymorphism, OR odds ratio

Fig. 2 Univariable MR estimates for the causal effects of SmkInit on the risk of infectious diseases. MR Mendelian randomization, SmkInit smoking initiation, URTI upper respiratory tract infections, UTI urinary tract infections, SNP single nucleotide polymorphism, OR odds ratio

Fig. 3 Univariable MR estimates for the causal effects of CigDay on the risk of infectious diseases. MR Mendelian randomization, CigDay cigarettes per day, URTI upper respiratory tract infections, UTI urinary tract infections, SNP single nucleotide polymorphism, OR odds ratio

Fig. 4 Univariable MR estimates for the causal effects of LifSmk on the risk of infectious diseases. MR Mendelian randomization, LifSmk lifetime smoking, URTI upper respiratory tract infections, UTI urinary tract infections, SNP single nucleotide polymorphism, OR odds ratio

Fig. 5 Univariable MR estimates for the causal effects of DrnkWk on the risk of infectious diseases. MR Mendelian randomization, DrnkWk drinks per week, URTI upper respiratory tract infections, UTI urinary tract infections, SNP single nucleotide polymorphism, OR odds ratio

Multivariable Mendelian Randomization Analysis

The results of the multivariable MR analysis after adjusting for education and BMI are shown in Supplementary Table S7. The following significant causal relationships were identified: SmkInit and sepsis (OR 1.328, 95% CI 1.083–1.657, P = 0.001), SmkInit and pneumonia (OR 1.731, 95% CI 1.429–2.084, P = 6.7 × 10⁻⁸), SmkInit and UTI (OR 1.329, 95% CI 1.170–1.749, P = 4.5 × 10⁻⁴), CigDay and sepsis (OR 1.518, 95% CI 1.089–2.643, P = 0.001), CigDay and pneumonia (OR 1.537, 95% CI 1.204–2.197, P = 1.8 × 10⁻⁵), LifSmk and sepsis (OR 2.055, 95% CI 1.438–2.741, P = 3.7 × 10⁻⁵), LifSmk and pneumonia (OR 3.207, 95% CI 2.481–4.087, P = 9.8 × 10⁻²⁶), LifSmk and URTI (OR 2.251, 95% CI 1.362–4.199, P = 0.002), and LifSmk and UTI (OR 1.948, 95% CI 1.406–2.330, P = 7.5 × 10⁻⁷).

Sensitivity Analyses

To assess the robustness of our findings, a series of sensitivity analyses were conducted, including Cochran’s Q test, the MR-Egger intercept test, leave-one-out analysis and a funnel plot. Supplementary Table S6 displays the results of the MR-Egger intercept test and Cochran’s Q test. No horizontal pleiotropy was detected between instrumental variables and outcomes (all P values of the MR-Egger intercept tests were more than 0.05). However, heterogeneity was observed in the Cochran’s Q test analysis between SmkInit and risk of pneumonia (Q = 134.52, P = 0.012), SmkInit and risk of URTI (Q = 143.85, P = 0.003), SmkInit and risk of UTI (Q = 143.59, P = 0.003), CigDay and risk of pneumonia (Q = 46.16, P = 0.039), CigDay and risk of UTI (Q = 60.54, P = 0.001), LifSmk and risk of sepsis (Q = 129.98, P = 0.017), LifSmk and risk of URTI (Q = 122.64, P = 0.047), LifSmk and risk of UTI (Q = 139.99, P = 0.004), DrnkWk and risk of pneumonia (Q = 105.29, P = 0.004), and DrnkWk and risk of UTI (Q = 98.80, P = 0.013). Although heterogeneity was detected in the above results, it did not invalidate the MR estimates because we used the random-effects IVW method as the primary analysis, which can balance the pooled heterogeneity. Additionally, leave-one-out analysis indicated that the causal estimates of tobacco smoking, alcohol use and the risk of infectious diseases were not driven by any single SNP (Supplementary Figs. S6–S10). Lastly, the funnel plots for MR analysis showed that the data points were symmetrically distributed around the funnel, indicating that no substantial asymmetry existed (Supplementary Figs. S11–S15).

Discussion

To the best of our knowledge, this is the first large-scale MR analysis to investigate the causal effects between smoking, alcohol use and risk of infectious diseases. Our study indicated that genetically predicted SmkInit, CigDay and LifSmk were associated with an increased risk of sepsis and pneumonia. Furthermore, no evidence was found to support an association between alcohol use and the risk of infectious diseases. Taken together, our findings provided a better understanding of the role of tobacco smoking and alcohol use in infectious diseases, indicating that reducing the number of cigarettes smoked may have beneficial health effects.

Tobacco and alcohol are the most commonly co-abused substances globally. A large number of observational studies have described the links between tobacco smoking, alcohol consumption and the risk of various infectious diseases [6,7,19,20,21,22]. For instance, a prospective cohort study conducted by Calfee et al. [6] demonstrated that cigarette smoking was associated with an increased risk of developing ARDS in sepsis, independent of other ARDS predictors, including alcohol abuse, diabetes, and severity of illness. Another retrospective cohort study including 11,651 adult admissions reported that alcohol dependence was independently associated with sepsis (12.9% vs. 7.6%, P < 0.001) and septic shock (3.6% vs. 2.1%, P = 0.001) [19]. Additionally, the latest meta-analysis, which included 17 observational studies, also reported that high alcohol consumption increases the risk of ARDS [23]. Despite the large amount of evidence from observational studies showing that tobacco smoking and alcohol consumption are associated with increased risk of infectious diseases, it is difficult to establish causality from these findings.

Unlike conventional observational studies, MR analysis can reduce the possible effects of confounding factors and reverse causality by using genetic variation to assess causal associations with outcomes. This technique utilizes available GWASs to screen for candidate genetic instrumental variables to use as robust proxies for modifiable exposures. Notably, the genetic variants are unlikely to be associated with confounders or subject to reverse causation because they are randomly distributed at the time of gametogenesis. Therefore, MR analysis may be considered conceptually similar to a randomized controlled trial, and is less susceptible to confounding or reverse causality than traditional observational studies. Taken together, MR is a powerful and effective tool to evaluate the causal relationship between exposure and outcome.

Although the underlying mechanism for the influence of tobacco smoking on the risk of sepsis and pneumonia is not fully understood, several possible reasons may explain these causal effects. (1) Tobacco smoking increases the levels of pro-inflammatory cytokines such as tumor necrosis factor alpha and interleukin 6, which may result in a higher risk of infectious diseases [24]. (2) Potentially toxic substances in tobacco smoke can damage the vascular endothelial cells, thus increasing susceptibility to many infectious diseases [5]. (3) Smokers tend to have worse health habits, such as lower vaccination uptake [5]. Possible explanations for the observed association between alcohol consumption and risk of infectious diseases include glutathione depletion, Toll-like receptor upregulation and impairment of macrophage function [23, 25, 26].

Our findings have important implications for tobacco product regulation. In traditional economic models, the causal relationship between tobacco smoking and the risk of infectious diseases is not well represented. Additionally, the inclusion of immediate changes in disease burden may have a substantial effect on the discounted present value of reductions in smoking [6]. The findings in this study also indicate that clinicians may monitor the risk of infectious diseases in smoking patients because they bear the burden of increased risk of a variety of infectious diseases due to tobacco smoking.

The current study has several strengths. First, the sample size was large, allowing us to gain more precise estimates and detect slight statistical differences. Second, univariable and multivariable MR methods were applied to explore the causal associations, and these methods tend to be less biased than conventional observational studies. Third, the MR method took advantage of GWAS summary data on tobacco smoking, alcohol consumption and risk of infectious diseases that were derived from two independent populations. Fourth, we used multiple tobacco smoking variables, including AgeSmk, SmkInit, CigDay and LifSmk, which enabled us to evaluate various dimensions of tobacco smoking and identify possible causalities of tobacco smoking and risk of infectious diseases.

However, this study has several limitations. First, our findings rested on data from GWASs that were only performed in individuals of European ancestry, with a lack of ancestral and cultural diversity. Therefore, it is uncertain whether these results can be generalized to other ethnic groups. However, the uniformity of the participants ensures a minimal risk of confounding by population admixture. Second, we detected heterogeneity in certain results. However, the existence of heterogeneity did not invalidate the MR estimates because the random-effects IVW method was used as the primary analysis in this study, which can balance the pooled heterogeneity. Third, we only provided evidence at the genetic level; there was a lack of additional mediator analysis and observational studies to further confirm the specific regulation mechanisms involved in the causal effects between tobacco smoking, alcohol consumption and the risk of infectious diseases. Thus, further studies are needed to confirm our findings.

Conclusion

In this MR study, we demonstrated the causal associations between tobacco smoking and risk of infectious diseases. However, no evidence was found to support the causality between alcohol use and the risk of infectious diseases.