Introduction

Idiopathic pulmonary fibrosis (IPF) is a progressive interstitial lung disease (ILD) of unknown origin and has a poor prognosis [1]. Epidemiological studies reveal that the global incidence and prevalence of IPF are increasing annually, and the median survival duration following an IPF diagnosis is within the range of 3 to 5 years, with a five-year survival rate of under 30% [2, 3].

Cigarette smoking (CS) is a potentially modifiable lifestyle factor that consistently ranks among the primary risk factors for IPF. CS has therefore garnered significant attention as a promising target for intervention to reduce the risk of IPF and halt its progression [4]. However, research on the specific role of smoking in driving disease progression in individuals with IPF is still limited. The diffusing capacity of the lung for carbon monoxide (DLCO) is an important indicator of disease progression in patients with IPF and reflects the ability of the lungs to transfer gas [5, 6]. Previous observational studies have shown that both current smoking status and increasing pack-years of CS are linked to lower DLCO, implying that individuals with a history of CS, particularly those who have smoked for an extended period or at a higher intensity, are more susceptible to a decline in DLCO [7,8,9,10]. Smoking therefore appears to contribute to decreased DLCO in patients diagnosed with IPF. However, whether smoking is a causal factor remains uncertain because these studies are primarily observational and thus susceptible to confounding bias and reverse causality, which complicates the interpretation of the relationship between smoking and DLCO in patients with IPF.

Mendelian randomization (MR) is an approach that addresses these challenges by using genetic variants as tools for establishing causal relationships; it is less vulnerable to confounding bias than are conventional observational studies [11]. In this context, we conducted an MR study to explore the potential causal link between smoking and a decrease in DLCO.

In summary, our research aimed to investigate the causal relationship between smoking and DLCO in individuals with IPF via a two-sample MR approach based on summary data from large-scale genome-wide association studies (GWAS).

Methods

Study design

To evaluate the causal link between smoking and DLCO, we executed a two-sample MR analysis. The reliability of instrumental variables (IVs) hinges upon the fulfillment of three fundamental assumptions [12]. First, the genetic variants employed as IVs must be significantly associated with the targeted exposure. Second, these IVs should not be linked to any confounders. Finally, these IVs should affect the outcome only through the exposure and not via alternative pathways (Fig. 1).

Fig. 1 Workflow and the three assumptions for Mendelian randomization analysis

GWAS data sources

The primary metric for quantifying smoking behaviour was the lifetime smoking index (LSI), which was ascertained through a GWAS carried out in the UK Biobank; this study included 462,690 individuals of European descent, as reported by Wootton et al. [13]. The LSI was constructed from self-reported questionnaire data on smoking intensity, duration, and initiation, following the methodology outlined by Leffondre et al. [14]. This approach aimed to provide a more comprehensive representation of smoking habits. The study identified 124 genetic markers associated with the LSI, all of which reached genome-wide significance (\(P < 5 \times 10^{-8}\)) and exhibited minimal linkage disequilibrium (LD) (\(r^{2} < 0.001\)).

The GWAS summary data for DLCO were derived from The Collaborative Group of Genetic Studies of IPF [15]. This comprehensive analysis of 3 cohorts (US, UK, and UUS) included the genotype data of 975 individuals diagnosed with IPF. The patients were diagnosed in accordance with the guidelines established by the American Thoracic Society and the European Respiratory Society.

Selection of genetic instruments

To ensure the reliability of the IVs used for MR analyses, we adhered to the following criteria. First, we chose single nucleotide polymorphisms (SNPs) associated with smoking-related traits at a threshold of \(P < 5 \times 10^{-8}\). Second, to prevent LD among the IVs, we set the clumping parameter to \(r^{2} < 0.001\) with a window size of 10 Mb. Third, during the harmonization process, we removed palindromic SNPs from the IV set. Fourth, to mitigate the risk of bias stemming from weak IVs, we calculated the F-statistic for each SNP as \(F = \beta^{2}/\mathrm{se}^{2}\) [16]. An F-statistic above 10 indicates that the likelihood of bias from weak IVs is minimal [17]. Moreover, all GWAS data utilized in our MR analyses were limited to individuals of European descent to exclude potential biases from population heterogeneity.
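The per-SNP filtering described above can be illustrated with a minimal sketch. It is not the pipeline used in this study (the analyses were run with the TwoSampleMR R package); the function names, the SNP identifiers, and all numerical values below are hypothetical and serve only to show the F-statistic threshold and the palindromic-SNP check.

```python
def f_statistic(beta, se):
    """Approximate per-SNP F-statistic: F = beta^2 / se^2."""
    return (beta / se) ** 2


def is_palindromic(allele_1, allele_2):
    """True for A/T or C/G SNPs, whose strand cannot be inferred from the alleles alone."""
    return {allele_1.upper(), allele_2.upper()} in ({"A", "T"}, {"C", "G"})


# Hypothetical SNP-exposure summary statistics (illustrative values only).
snps = [
    {"rsid": "rs_demo_1", "ea": "A", "oa": "G", "beta": 0.030, "se": 0.004},
    {"rsid": "rs_demo_2", "ea": "A", "oa": "T", "beta": 0.025, "se": 0.004},
    {"rsid": "rs_demo_3", "ea": "C", "oa": "T", "beta": 0.010, "se": 0.004},
]

# Keep SNPs that are not palindromic and whose F-statistic exceeds 10.
kept = [
    s["rsid"]
    for s in snps
    if not is_palindromic(s["ea"], s["oa"]) and f_statistic(s["beta"], s["se"]) > 10
]
print(kept)
```

In this toy input, the second SNP is dropped because it is palindromic (A/T) and the third because its F-statistic falls below 10.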

Mendelian randomization analyses

In our MR analysis, we chose the inverse variance weighted (IVW) method, which combines the Wald ratio for each SNP, as the primary approach, leading to a consolidated causal estimate [18, 19]. To ensure the robustness of our analysis and account for potential pleiotropy, we also conducted sensitivity analyses using several complementary methods. These methods included the weighted mode [20], MR‒Egger [21], weighted median [22], simple mode [23], and MR-Pleiotropy RESidual Sum and Outlier (MR-PRESSO) [24].
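As an illustration of how the IVW method combines per-SNP Wald ratios, the following sketch uses a fixed-effect form with a first-order approximation of the ratio standard error. Both choices are assumptions made for brevity and may differ from the defaults of the software used in this study; the input values are hypothetical.

```python
import math


def wald_ratio(beta_exp, beta_out, se_out):
    """Per-SNP causal estimate and its first-order standard error."""
    ratio = beta_out / beta_exp
    se = se_out / abs(beta_exp)
    return ratio, se


def ivw(ratios, ses):
    """Fixed-effect inverse-variance-weighted mean of the Wald ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se


# Hypothetical harmonized SNP effects (illustrative values only).
beta_exp = [0.05, 0.05]
beta_out = [-0.02, -0.04]
se_out = [0.01, 0.01]

pairs = [wald_ratio(bx, by, sy) for bx, by, sy in zip(beta_exp, beta_out, se_out)]
ratios = [p[0] for p in pairs]
ses = [p[1] for p in pairs]
ivw_estimate, ivw_se = ivw(ratios, ses)
print(ivw_estimate, ivw_se)
```

Because the two hypothetical SNPs have equal precision, the pooled estimate is simply the average of their Wald ratios.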

Sensitivity analysis

To identify potential pleiotropy and assess the robustness of our results, we conducted several analyses, including Cochran’s Q statistic [25] and MR‒Egger intercept tests [21]. Specifically, heterogeneity was indicated if the P value of the Cochran Q test was less than 0.05. We also assessed horizontal pleiotropy based on the intercept term derived from MR‒Egger regression. In addition, to ascertain whether any single SNP drove the causal estimate, we performed leave-one-out analysis [26].
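The two tests named above can be sketched as follows, assuming a fixed-effect IVW estimate for Cochran's Q and weights of 1/se² on the outcome associations for the MR‒Egger regression. The function names and input values are hypothetical, and P values are omitted because they require a chi-square or t distribution function that is outside the Python standard library.

```python
def cochran_q(ratios, ses):
    """Cochran's Q around the fixed-effect IVW estimate, with its degrees of freedom."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


def egger_regression(beta_exp, beta_out, se_out):
    """Weighted regression of SNP-outcome on SNP-exposure effects with an intercept."""
    xs, ys = [], []
    for bx, by in zip(beta_exp, beta_out):
        sign = -1.0 if bx < 0 else 1.0  # orient every SNP to a positive exposure effect
        xs.append(sign * bx)
        ys.append(sign * by)
    ws = [1.0 / se ** 2 for se in se_out]
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx ** 2)
    intercept = (sy - slope * sx) / sw
    return intercept, slope


# Hypothetical inputs (illustrative values only).
q, df = cochran_q([-0.4, -0.8], [0.2, 0.2])
intercept, slope = egger_regression(
    [0.02, 0.04, -0.06], [0.0, -0.01, 0.02], [0.01, 0.02, 0.01]
)
print(q, df, intercept, slope)
```

Under the second and third IV assumptions the MR‒Egger intercept should be close to zero; an intercept that differs from zero suggests directional horizontal pleiotropy.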

MR analysis was performed using RStudio (version 4.2.1) with the TwoSampleMR (version 0.5.6) and MRPRESSO (version 1.0) R packages. A significance level of P < 0.05 was used to determine statistical significance.

Ethics

Summary data were used, and ethical approval was not needed.

Results

MR estimate

We identified 119 SNPs (Additional file 1) as IVs to investigate the genetic relationship between LSI and DLCO. The F-statistic for every SNP exceeded 30, indicating a low probability of weak-instrument bias. Subsequently, we conducted an MR analysis utilizing these 119 SNPs. The results obtained through the IVW method revealed a causal link between the LSI and DLCO (OR = 0.54, 95% CI 0.32–0.93, P = 0.02; Fig. 2). The direction of effect was consistent across all six methods (Fig. 2): MR‒Egger (OR = 0.09, 95% CI 0.01–0.73, P = 0.03), weighted median (OR = 0.41, 95% CI 0.18–0.90, P = 0.03), simple mode (OR = 0.23, 95% CI 0.03–2.02, P = 0.19), weighted mode (OR = 0.25, 95% CI 0.05–1.26, P = 0.1), and MR-PRESSO (OR = 0.54, 95% CI 0.32–0.93, P = 0.03), although the simple mode and weighted mode estimates did not reach statistical significance. As illustrated in the scatter plot (Fig. 3A), there was a noticeable decrease in DLCO as the LSI increased.
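As a rough consistency check on the reported IVW estimate, the log-scale effect, its standard error, and a two-sided P value can be back-calculated from the OR and its 95% CI. This sketch assumes that the interval is symmetric on the log scale and that the test statistic is approximately normal; the function name is hypothetical, and the original software may use a different reference distribution.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate log(OR), its SE, and a two-sided normal P value from a 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area of the standard normal
    return beta, se, p


# Reported IVW estimate: OR = 0.54, 95% CI 0.32-0.93.
beta, se, p = p_from_or_ci(0.54, 0.32, 0.93)
print(round(beta, 3), round(se, 3), round(p, 2))
```

Under these assumptions the back-calculated P value rounds to 0.02, in agreement with the value reported for the IVW method.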

Fig. 2 Causal effect of the lifetime smoking index on DLCO

Sensitivity analyses

We subsequently conducted sensitivity analyses to assess the robustness of our results. First, Cochran’s Q test demonstrated the absence of heterogeneity among the IVs (MR‒Egger P = 0.374, IVW P = 0.325; Table 1). The absence of heterogeneity was also supported by the symmetry of the funnel plot (Fig. 3B). Second, there was no indication of overall horizontal pleiotropy across the IVs, as evidenced by the results of both the MR‒Egger intercept test (P = 0.085, Table 1) and the MR-PRESSO global test (P = 0.329, Table 1). These results imply that the IVs are unlikely to influence DLCO through pathways unrelated to smoking. In the leave-one-out sensitivity analysis, in which we systematically excluded one SNP at a time, no single SNP drove the overall causal estimate (Additional file 2). Overall, our results remained robust and exhibited no substantial bias.
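The leave-one-out procedure can be sketched as repeated re-estimation with one SNP removed each time. The example assumes a fixed-effect IVW estimator and uses hypothetical Wald ratios and standard errors; it is an illustration rather than the implementation used in this study.

```python
def ivw_estimate(ratios, ses):
    """Fixed-effect inverse-variance-weighted mean of per-SNP Wald ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def leave_one_out(ratios, ses):
    """IVW estimate recomputed with each SNP excluded in turn."""
    results = []
    for i in range(len(ratios)):
        kept_ratios = ratios[:i] + ratios[i + 1:]
        kept_ses = ses[:i] + ses[i + 1:]
        results.append(ivw_estimate(kept_ratios, kept_ses))
    return results


# Hypothetical Wald ratios and standard errors (illustrative values only).
ratios = [-0.4, -0.8, -0.6]
ses = [0.2, 0.2, 0.2]
loo = leave_one_out(ratios, ses)
print(loo)
```

If every leave-one-out estimate stays close to the full-sample estimate and keeps the same sign, no single SNP is driving the result.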

Fig. 3 Scatter plot (A) of the effect of the lifetime smoking index on DLCO and funnel plot (B)

Table 1 Pleiotropy and heterogeneity tests

Discussion

IPF patients have a median survival of 3–5 years after diagnosis but a highly variable clinical course [27]. Lung function in patients with IPF may decline precipitously from the onset of the disease or slowly over its course, during which acute exacerbations (AEs) can occur and lead to respiratory failure and early death [28]. Therefore, further research into the factors associated with the progression of IPF is essential, as identifying these factors can aid prevention and help slow disease progression.

Pulmonary function tests are essential for detecting, diagnosing, and monitoring the progression of IPF. Given the infancy of computed tomography biomarkers, estimates of disease severity and risk stratification in IPF are still based almost exclusively on functional and physiologic indices, such as forced vital capacity (FVC), DLCO, and the 6-minute walk test (6MWT), with DLCO considered one of the most valuable parameters for monitoring the progression of IPF [29,30,31]. To the best of our knowledge, this is the first study to examine the causal link between smoking and DLCO in patients with IPF within the MR framework. Our approach drew upon large-scale GWAS data, allowing us to analyse a substantially larger number of cases than did previous observational studies. As expected, our study indicated that smoking has a negative effect on DLCO, which is largely in line with the findings of previous research [7,8,9,10]. While the link between smoking and IPF has been established in previous observational studies, our MR analysis offers evidence that is consistent with a causal connection and is less vulnerable to confounding bias. Nevertheless, because we utilized summary-level data, we were unable to examine sex-specific associations, indicating the need for future investigations in this area.

The specific mechanisms through which smoking exacerbates a decrease in DLCO have not been identified. Chronic lung inflammation and oxidative stress may be potential pathways mediating the relationship between smoking and reduced DLCO levels. Smoking harms the lungs by inciting chronic inflammation and oxidative stress, thus worsening the progression of IPF [32,33,34,35]. It leads to persistent inflammation, disrupts the oxidant balance, contributes to the buildup of extracellular matrix in the lungs, impairs lung function, hampers gas exchange, and accelerates the deterioration of IPF [32, 36]. Exposure to CS or its extract (CSE) results in the senescence of alveolar epithelial type 2 (AT2) cells, a pivotal process in the progression of lung fibrosis [37]. Several mechanisms drive the CS-induced senescence of AT2 cells, including decreased autophagy, deactivation of the SIRT1 protein, DNA damage, and heightened oxidative stress. In addition, there is growing evidence of a potential correlation between smoking and a variety of IPF prognostic factors (such as MMP-7, SP-A, SP-D, GDF15, and CA-125). For example, higher levels of LOXL2 are associated with disease progression in IPF patients, and there is evidence that LOXL2 is significantly upregulated in patients who smoke. Moreover, the serum marker SP-D was found to be higher in smoking patients than in non-smoking patients [31, 38, 39]. Together, these findings suggest a potential relationship between CS and multiple IPF prognostic factors. Overall, CS plays a pivotal role in causing additional damage to the lungs [40]. Concerning public health implications, our findings lend support to the notion that smoking cessation initiatives can serve as an efficacious strategy for mitigating the decrease in DLCO and the ensuing adverse consequences.

Our study offers several notable advantages. First, to our knowledge, it is the first MR investigation to evaluate the causal relationship between greater smoking exposure and decreased DLCO in patients with IPF. Second, the robustness of the results was supported by various sensitivity analysis methods. Nevertheless, the study's limitations must be acknowledged. First, our findings predominantly pertain to participants of European ancestry, and their applicability to populations of other ancestries may be limited. Second, despite the absence of detectable horizontal pleiotropy in our analysis, there may be residual bias due to limited knowledge about the precise functions of most of these SNPs. Third, as our study relied on GWAS summary data instead of individual-level data, it was not possible to stratify our analysis by other variables, such as age and sex.

Conclusion

Our study suggested that smoking is an important factor for DLCO decline in IPF patients, which may provide new insights into the progression of IPF. Considering the imperative of delaying disease progression, significant emphasis should be placed on lifestyle management, including smoking cessation as a relevant strategy.