Background

Lung cancer is one of the most common cancers and a leading cause of cancer-related death worldwide [1, 2]. In 2018, there were 2.09 million new cases of and 1.76 million deaths from lung cancer worldwide, accounting for 11.6% and 18.4% of all cancer cases and deaths, respectively [3]. In particular, non-small cell lung cancer (NSCLC), the most common type of lung cancer, accounts for approximately 85% of all lung cancer cases [4]. Due to the increasing burden of NSCLC, it is necessary to identify more potential risk factors associated with NSCLC so as to develop individualized prevention strategies.

Obesity, usually defined as body mass index (BMI) ≥ 30 kg/m2, is becoming an increasingly common global health problem [5]. The global prevalence of obesity in adults increased steadily between 1975 and 2016, from 3 to 11% in men and 6 to 15% in women [6]. Several epidemiological studies have demonstrated that a higher BMI is associated with a lower risk of NSCLC in European and Asian populations [2, 7], which was also confirmed by a recent meta-analysis with a sample size of 7,310,130 participants [8]. However, most of these studies only used BMI at a single time point instead of considering the role of longitudinal BMI trajectories across the life course. In addition, a number of studies have shown that a BMI trajectory from normal weight to obesity was associated with the risk of multiple cancers, including prostate, colorectal, oesophageal, gastric cardia adenocarcinoma, and even lung cancer [9,10,11].

Although environmental risk factors (e.g. BMI) are the main risk factors for NSCLC [12], genetic susceptibility is also an important contributor [13]. The heritability of lung cancer in European and Asian populations is estimated to be 12–21% [14, 15]. Previous genome-wide association studies (GWAS) identified more than 80 susceptibility variants associated with lung cancer in European and Asian populations, mainly NSCLC, as it is the main type of lung cancer; however, these variants could only explain a small proportion of the overall genetic variance [16, 17]. Interestingly, there is accumulating evidence that gene-environment interactions may be responsible for the missing heritability of cancer and act together with environmental risk factors in the pathogenesis of cancer [18, 19].

However, it remained unclear whether there was evidence to support the joint association between BMI trajectories and genetic variants on NSCLC incidence. In this study, we comprehensively investigated the relationship between BMI trajectories and NSCLC risk in the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. In addition, we applied a genome-wide interaction analysis to further assess the effect of different BMI trajectories in participants stratified according to genetic variants on NSCLC risk, which can provide novel insights into the pathophysiology of NSCLC.

Methods

Study population

The PLCO Cancer Screening Trial is a population-based cohort study aimed to evaluate the accuracy and reliability of screening methods for prostate, lung, colorectal, and ovarian cancer, which randomly recruited 154,897 individuals aged 49–78 years from 10 centres in the USA between 1993 and 2001 [20]. Exclusion criteria included (i) personal history of cancer prior to trial entry (n = 11,803); (ii) individuals with missing BMI at any age (n = 3,504) or BMI < 15 or > 50 kg/m2 (n = 361); (iii) individuals failing to return or complete the baseline questionnaire (n = 669); (iv) individuals at enrolment with age < 50 years (n = 2); and (v) individuals with small cell lung cancer (n = 448). Ultimately, a total of 138,110 participants were retained for analysis. No included individuals had been diagnosed with lung cancer at the time of voluntarily joining the study. The diagnosis of NSCLC was histologically confirmed via medical record reviews, the National Death Index (for completeness), and self-reported annual questionnaires during follow-up [21]. This study was approved by the ethics committees of the PLCO consortium providers (PLCO-424). Additional information for the study subjects is presented in the Additional file 1: Appendix S1 [22].

BMI and BMI trajectories ascertainment

Height (m) and body weight (kg) at age 20, age 50, and enrolment were collected from self-reported questionnaires completed by the participants in the PLCO study (https://cdas.cancer.gov/datasets/plco/90/). BMI at each age was calculated as body weight (kg) divided by height squared (m2). Individuals were classified by their BMI at each age according to the World Health Organization 2000 criteria: underweight (<18.5 kg/m2), normal weight (18.5–24.9 kg/m2), overweight (25.0–29.9 kg/m2), and obesity (≥30 kg/m2) [23]. To assess the relationship between pre-diagnostic BMI changes (from age 20 or 50 to study entry) and the risk of NSCLC, a latent class growth model (LCGM) was used to identify longitudinal patterns of BMI change across the three time points during adulthood (age 20, age 50, and entry) [24]. Specifically, the LCGM was fitted using linear and quadratic polynomials with three to five trajectory categories (individuals per trajectory ≥ 1%), and the model with the highest number of fitting categories was selected using the Bayesian Information Criterion (BIC) and the average posterior probability (AvePP) of each trajectory [25]. Detailed information for the calculation of BMI trajectories is provided in the Additional file 1: Appendix S1 [26, 27].
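The BMI calculation and categorisation described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `bmi` and `who_category` and the example participant are hypothetical, obesity is taken as BMI of 30 kg/m2 or more, and values falling between the printed category bounds (for example 24.95) are assigned to the lower category.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Classify a BMI value using the WHO cut-offs quoted in the text.

    Assumption: obesity starts at 30.0 kg/m^2 (inclusive).
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal weight"
    if bmi_value < 30.0:
        return "overweight"
    return "obesity"


# Hypothetical participant: one height, three self-reported weights (kg).
height = 1.75
weights = {"age 20": 68.0, "age 50": 80.0, "enrolment": 95.0}
categories = {age: who_category(bmi(w, height)) for age, w in weights.items()}
```

For this made-up participant the three categories describe a normal-to-obese pattern; the trajectories in the study itself were derived from the latent class growth model, not from fixed cut-offs.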

Genotyping

The PLCO GWAS data were deposited in the database of Genotypes and Phenotypes (dbGaP, phs001286.v1.p1 and phs000336.v1.p1), including a total of 14,497 participants genotyped using Illumina Hap240, Hap300, and Hap550 [28, 29]. The use of the PLCO genetic datasets was approved by both the Internal Review Board of Nanjing Medical University and the dbGaP database administration (#21708 and #21643). Basic information on genotyping and imputation for PLCO GWAS data is shown in the Additional file 1: Appendix S1 [30,31,32]. Additional quality control procedures at the individual and single nucleotide polymorphism (SNP) levels are presented in the Additional file 1: Appendix S1. Ultimately, 13,365 individuals remained in the genetic analysis (Additional file 1: Table S1).

Analysis of the interaction between the GWAS-based polygenic risk score (PRS) and BMI trajectories

Based on 81 previously reported GWAS SNPs associated with lung cancer in European and Asian populations [16, 17], and a strict quality control process, including (i) SNPs located within autosomal chromosomes; (ii) minor allele frequency (MAF) ≥ 0.05; (iii) call rate ≥ 95%; (iv) P-value for Hardy-Weinberg Equilibrium (HWE) among non-NSCLC individuals ≥ 1.0×10−6; (v) imputation INFO > 0.3; and (vi) a risk effect consistent with previous results, we identified 19 independent [linkage disequilibrium (LD), r2 < 0.5] GWAS-identified SNPs (Additional file 1: Table S2) to construct the simple-count PRS (sPRS) and weighted PRS (wPRS) [16, 17, 33]. The sPRS is equal to the number of risk alleles, which can be estimated as \(sPRS=\sum \limits_{i=1}^I{G}_i\), where \({G}_i\) (i.e. 0, 1, or 2) denotes the number of risk alleles of the ith SNP and I is the number of SNPs. The wPRS was calculated using the formula: \(wPRS=\sum \limits_{i=1}^I{\beta}_i{G}_i\), where \({\beta}_i\) is the weight of the ith SNP, taken from the per-allele ORs reported in previous studies [16, 17, 33]. Additional information on the analysis of the interaction between the PRSGWAS and BMI trajectory is presented in the Additional file 1: Appendix S1.
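The two scores defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `simple_prs` and `weighted_prs` are hypothetical, and the genotypes and weights below are made up rather than taken from Table S2. The weights are passed in as given, so any transformation of the published per-allele ORs is left to the caller.

```python
from typing import Sequence


def _check_genotypes(genotypes: Sequence[int]) -> None:
    """Risk-allele counts must be 0, 1, or 2 for every SNP."""
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("risk-allele counts must be 0, 1, or 2")


def simple_prs(genotypes: Sequence[int]) -> int:
    """sPRS: total number of risk alleles, i.e. the sum of G_i over SNPs."""
    _check_genotypes(genotypes)
    return sum(genotypes)


def weighted_prs(genotypes: Sequence[int], weights: Sequence[float]) -> float:
    """wPRS: sum over SNPs of beta_i * G_i."""
    _check_genotypes(genotypes)
    if len(genotypes) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(b * g for b, g in zip(weights, genotypes))


# Hypothetical individual: three SNPs with illustrative (made-up) weights.
g = [0, 1, 2]
beta = [0.10, 0.25, 0.15]
s_score = simple_prs(g)
w_score = weighted_prs(g, beta)
```

Scores computed this way for every participant would then be split into tertiles before being entered into the interaction models.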

Genome-wide interaction analysis (GWIA)

GWIA was performed to test for the gene-environment interactions between genome-wide SNPs and BMI trajectories. The interaction was modelled by determining the multiplicative product of SNP genotype and BMI trajectories in the Cox proportional hazard regression model, adjusting for age, sex, race, family history of lung cancer, education, smoking status, personal history of diabetes, current marital status, study centre, and the first 10 principal components. For GWIA, the P-value of the interaction term < 1.0×10−6 was considered statistically significant [34]. Similar to the construction of PRSGWAS, the GWIA-based sPRS (sPRSGWIA) or wPRS (wPRSGWIA) was also calculated to evaluate the cumulative interaction effects with BMI trajectories, separately.
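Written out, a multiplicative interaction model of this kind has the general form below. The notation is introduced here for illustration and is not taken from the source: G is the SNP genotype (risk-allele count), E the BMI trajectory, Z the vector of adjustment covariates (including the principal components), and h_0(t) the unspecified baseline hazard.

```latex
h(t \mid G, E, \mathbf{Z}) = h_0(t)\,
  \exp\!\left( \beta_G\, G + \beta_E\, E + \beta_{GE}\,(G \times E)
  + \boldsymbol{\gamma}^{\top}\mathbf{Z} \right)
```

Under this form, the interaction test corresponds to the null hypothesis \(\beta_{GE} = 0\), and \(\exp(\beta_{GE})\) is the factor by which the hazard ratio for BMI trajectory changes per additional risk allele.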

Functional annotation

Functional annotation was conducted to explore the potential molecular roles of the GWIA-identified loci by (i) pinpointing the most likely candidate genes at the identified loci by identifying cis-expression quantitative trait loci (cis-eQTL) within no more than 1 Mb of each investigated SNP in the Genotype-Tissue Expression project (version 7.0, http://www.gtexportal.org/home/) database from multiple relevant tissues [35, 36] and (ii) using the Encyclopedia of DNA Elements [37], HaploReg (version 4.1) [38], and RegulomeDB (http://www.regulomedb.org/) to further assess the regulatory potential for variants of interest.

Statistical analysis

A Cox proportional hazards regression model was used to estimate the hazard ratio (HR) and 95% confidence interval (CI) for the association between BMI trajectories and NSCLC risk, with adjustment for age, sex, race, family history of lung cancer, education, smoking status, personal history of diabetes, current marital status, and study centre. The proportional hazards assumption was assessed using Schoenfeld residuals [39]. Further, categorical exposures were modelled as continuous variables to test for linear trends. Individual follow-up time was defined as the period from entry until NSCLC diagnosis or censoring, where censoring was defined as exit from the study due to other causes or death, loss to follow-up, or the end of the study.

Interaction effects of PRSGWAS, PRSGWIA, or each GWIA-identified SNP with BMI trajectories were further investigated by adding multiplicative interaction terms to the Cox models, with adjustment for the first 10 principal components. A cumulative incidence function was estimated using the Kaplan-Meier technique to quantify the risk of developing NSCLC over time, stratified by GWIA-identified SNPs, and differences in the full time-to-event distributions between BMI trajectory groups were compared using a log-rank test [40].
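To make the Kaplan-Meier step concrete, the following is a minimal sketch of the product-limit estimator and the corresponding cumulative incidence (one minus survival). It is not the authors' implementation: the function names and the toy follow-up data are hypothetical, and competing risks are ignored.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the outcome was observed at times[i] and 0 if the
    participant was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def cumulative_incidence(curve: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Cumulative incidence as 1 - S(t); assumes no competing risks."""
    return [(t, 1.0 - s) for t, s in curve]


# Toy data: follow-up in years, 1 = diagnosed, 0 = censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0])
ci_curve = cumulative_incidence(km_curve)
```

In practice one such curve would be estimated per BMI trajectory group within each genotype stratum, and the groups compared with the log-rank test.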

Subgroup analysis was performed to evaluate the heterogeneity of the association between BMI trajectories and NSCLC risk stratified by sex, smoking status, or histological type. Further, several sensitivity analyses were performed to assess the reliability of the primary results. One-sample Mendelian randomization (MR) analysis was also performed to assess the causality between BMI trajectories and NSCLC risk, using the inverse-variance-weighted (IVW), Mendelian randomization Egger (MR-Egger), and simple median methods. Two-sided P values < 0.05 were deemed significant. All analyses were performed using R 3.5.3 and PLINK 1.90 software. Additional information is presented in the Additional file 1: Appendix S1.
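For orientation, two of the MR estimators named above can be sketched in their usual summary-statistic form. This is an illustration only, under the assumption that per-variant effect estimates are available; the details of the one-sample analysis used in the study are in Appendix S1 and may differ. The function names and numbers are hypothetical.

```python
from statistics import median
from typing import Sequence


def ivw_estimate(beta_x: Sequence[float], beta_y: Sequence[float],
                 se_y: Sequence[float]) -> float:
    """Fixed-effect IVW estimate.

    Equivalent to a weighted regression of variant-outcome effects (beta_y)
    on variant-exposure effects (beta_x) through the origin, with weights
    1 / se_y^2.
    """
    num = sum(bx * by / se ** 2 for bx, by, se in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_x, se_y))
    return num / den


def simple_median_estimate(beta_x: Sequence[float], beta_y: Sequence[float]) -> float:
    """Median of the per-variant Wald ratios beta_y / beta_x."""
    return median(by / bx for bx, by in zip(beta_x, beta_y))


# Made-up effects for three variants; every Wald ratio equals -0.2.
bx = [0.1, 0.2, 0.4]
by = [-0.02, -0.04, -0.08]
se = [0.01, 0.02, 0.02]
ivw = ivw_estimate(bx, by, se)
med = simple_median_estimate(bx, by)
```

When all variants give the same ratio, as in this toy input, both estimators agree; they diverge when some variants are invalid instruments, which is why several methods are usually reported side by side.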

Results

There were 138,110 individuals in the prospective cohort study (Table 1). In total, 2641 NSCLC patients with a mean age of 64.34 years (SD = 5.20) were confirmed, including 2343 (88.72%) white and 298 (11.28%) non-white participants (184 Black, 32 Hispanic, 63 Asian, and 19 other). Compared with non-NSCLC individuals, NSCLC was more common among participants who were male (HR = 0.61, 95% CI: 0.56 to 0.66, P < 2×10−16), older (HR = 1.06, 95% CI: 1.05 to 1.07, P < 2×10−16), non-Hispanic Black (HR = 1.58, 95% CI: 1.36 to 1.84, P = 2.32×10−9), and current (HR = 24.22, 95% CI: 20.93 to 28.03, P < 2×10−16) or ever smokers (HR = 6.94, 95% CI: 6.01 to 8.01, P < 2×10−16); had a family history of lung cancer (HR = 1.83, 95% CI: 1.66 to 2.03, P < 2×10−16); had a low level of education (HR = 0.63, 95% CI: 0.58 to 0.69, P < 2×10−16); had a history of diabetes (HR = 1.25, 95% CI: 1.09 to 1.44, P = 0.001); and were divorced, separated, or widowed (HR = 1.41, 95% CI: 1.30 to 1.54, P = 1.08×10−14).

Table 1 Characteristics of the study subjects

No evidence of departure from the proportional hazards assumption in Cox models for NSCLC (P = 0.166) was found. The Cox proportional hazards model showed that a higher BMI at 20 years, at 50 years, and at the time of enrolment (baseline) was associated with a decreased risk of NSCLC (HR = 0.88, P = 0.001; HR = 0.70, P < 2×10−16; HR = 0.75, P < 2×10−16, respectively), and similar findings were observed for categorical BMI (decreased risk in overweight and obesity, Additional file 1: Table S3). Further, we identified four distinct BMI trajectories using the latent class growth model (Fig. 1). Compared with participants with a stable normal BMI in their adulthood (n = 47,982, 34.74%), the risk of NSCLC decreased in participants who progressed from a normal BMI to an overweight BMI at baseline (n = 64,498, 46.70%, HR = 0.77, 95% CI: 0.70 to 0.84, P = 3.80×10−9), who progressed from a normal BMI to an obese BMI at baseline (n = 21,259, 15.39%, HR = 0.60, 95% CI: 0.53 to 0.69, P = 5.42×10−13), and who were overweight at the onset of adulthood and became obese at baseline (n = 4371, 3.16%, HR = 0.54, 95% CI: 0.40 to 0.74, P = 9.33×10−5). Interestingly, the NSCLC risk decreased gradually across all three BMI trajectories (HR for trend = 0.78, 95% CI: 0.74 to 0.83, P = 2×10−16) compared with subjects who maintained a normal BMI. Sensitivity analyses showed that the primary model retained a stable association between BMI trajectories and NSCLC risk (Additional file 1: Table S4). Furthermore, stratified analyses by sex, smoking status, and histological type showed almost no significant heterogeneity in the effect of age-specific BMI and BMI trajectories on NSCLC risk, although the P value for the heterogeneity test was less than 0.05 among those with BMI < 18.5 at baseline stratified by sex (Additional file 1: Figures S1-S3).

Fig. 1

The latent class growth model of BMI trajectories in the PLCO study. A BMI changes for each participant in each trajectory group across three analysed age points (ages of 20 years, 50 years, and baseline). B Each trajectory was calculated at any of the three analysed age points (ages of 20 years, 50 years, and baseline). HR and 95% CI were estimated by Cox proportional hazards regression model with the adjustment for age, sex, race, family history of lung cancer, education, smoking, personal history of diabetes, current marital status, and study centre

Nineteen GWAS-identified SNPs were used to construct the PRS and examine the potential effect of BMI trajectories on NSCLC risk according to the genetic variants. The characteristics of 13,365 individuals from the GWAS are shown in Additional file 1: Appendix S1. Nineteen GWAS-identified SNPs associated with lung cancer were used to construct the sPRS and wPRS (Additional file 1: Table S2). Furthermore, compared with the low tertiles of sPRSGWAS, the middle and high tertiles of sPRSGWAS were associated with a higher probability of NSCLC (HR = 1.13, 95% CI: 1.12 to 1.59, P = 0.001; HR = 1.56, 95% CI: 1.34 to 1.82, P = 1.62×10−8, respectively) (Additional file 1: Table S5). Similar results were obtained for wPRSGWAS, indicating that a higher PRSGWAS was associated with an increased risk of NSCLC. However, there was no significant interaction between BMI trajectories and PRSGWAS with the NSCLC risk (PsPRS= 0.863 and PwPRS= 0.704; Additional file 1: Figure S4). Similar findings were observed for age-specific BMI (Additional file 1: Tables S6-S7).

GWIA was subsequently performed to investigate the effect of the genome-wide interaction between each SNP and BMI trajectories on the NSCLC risk. A Manhattan plot was constructed to show the significant SNPs that interacted with BMI trajectories (Additional file 1: Figure S5). Four independent SNPs reached suggestive significance [34] rather than genome-wide significance in GWIA, and these were also confirmed in the bootstrap and permutation tests (Additional file 1: Table S8). Among the four SNPs, rs79297227, located in SLC16A7 (12q14.1), had the lowest P value (1.01×10−7) for the interaction with the BMI trajectories, and the remaining three SNPs, namely rs2336652 near CLASP2 (3p22.3, P = 3.92×10−7), rs16018 in CACNA1A (19p13.2, P = 3.92×10−7), and rs4726760 near BRAF (7q34, P = 9.19×10−7), also interacted with the BMI trajectories in terms of the NSCLC risk. Similar results were obtained from the analysis stratified by genotype (Table 2). Figure 2B displays the cumulative incidence of NSCLC stratified by GWIA-identified SNPs, compared using the log-rank test. In the sensitivity analysis, a significant interaction was observed between the four SNPs and the BMI trajectories after additionally adjusting for occupation and family history of any cancer or performing other sensitivity analyses (almost all P < 1.0×10−4, Additional file 1: Table S9). MR sensitivity analyses showed that the direction of the association between BMI trajectories and NSCLC risk was consistent with the above analysis, although the associations were not statistically significant, and there was no evidence of directional pleiotropy (Additional file 1: Tables S10-S11). For the functional annotation, the search for cis-eQTLs at the four loci detected by GWIA showed that SNP rs4726760 at 7q34 was a strong cis-eQTL for BRAF (P = 0.011, β = 0.073) in the lung tissue.
No cis-eQTL was found at the other three loci (rs16018, P = 0.070, β = 0.128; rs2336652, P = 0.854, β = −0.015; rs79297227, P = 0.376, β = −0.042) (Additional file 1: Figure S6A). SNP rs16018 is located on chromosome 19p13.2 in calcium voltage-gated channel subunit alpha1 A (CACNA1A), which is a protein-coding gene involved in calcium channel regulation; SNP rs2336652 at 3p22.3 is located near cytoplasmic linker-associated protein 2 (CLASP2), which is significantly expressed in lung tissue and promotes the stability of microtubules; and SNP rs79297227 at 12q14.1 is located in the solute carrier family 16 member 7 (SLC16A7), which is not only significantly expressed in lung tissues (Additional file 1: Figure S6B) but also expressed in various types of malignant tumours.

Table 2 Association between BMI trajectories and NSCLC risk stratified by the four susceptibility SNPs
Fig. 2

Stratification analysis for the interaction effects between BMI trajectories and GWIA-identified SNPs on NSCLC risk. A The four identified BMI trajectories from the onset of adulthood to the baseline. B Cumulative incidence of NSCLC stratified by GWIA-identified SNPs. P-value was derived from the log-rank test. C Pathway of the gene (BRAF)-BMI trajectories interaction effect on the risk of NSCLC

GWIA-based PRS of the four SNPs above was constructed to evaluate the cumulative interaction with BMI trajectories on NSCLC risk (Fig. 3). Although a significant association was identified between BMI trajectories and a higher NSCLC risk among the individuals with high tertiles of wPRSGWIA (HR for trend =1.30, 95% CI = 1.10–1.54), interestingly, BMI trajectories were also associated with a decreased risk of NSCLC among individuals with a low (0.54, 0.47–0.62) or intermediate tertiles of wPRSGWIA (0.85, 0.72–0.99), indicating an obvious interaction between the GWIA-based wPRSGWIA and BMI trajectories. Similar findings were observed for age-specific BMI (Additional file 1: Table S12). The interaction between BMI trajectories and PRSGWIA with the NSCLC risk was significant (PsPRS = 6.61×10−5 and PwPRS = 3.80×10−16; Additional file 1: Figure S4). In addition, individuals with low or intermediate tertiles of wPRSGWIA experienced a gradually decreased cancer risk across the BMI trajectories from normal to normal, normal to overweight, overweight to obese, and normal to obese, while the high tertiles of wPRSGWIA were just the opposite after adjustment for age, sex, race, family history of lung cancer, education, smoking, personal history of diabetes, current marital status, study centre, and first 10 principal components (Fig. 3A, B). Stratification analyses for wPRSGWIA showed that associations between BMI trajectories and NSCLC risk were heterogeneous (I2 = 73.09%, P for heterogeneity < 0.001, Fig. 3B). Similar results were also observed in sPRSGWIA (Additional file 1: Figure S4CD, Table S13).

Fig. 3

Interaction analysis and stratification analysis of BMI trajectories and the PRS constructed by four GWIA-identified SNPs on NSCLC risk. A, B wPRSGWIA were weighted according to the strength of their association with lung cancer. C, D sPRSGWIA were calculated by simple counting. P value for interaction was derived from multivariate-adjusted Cox proportional hazards regression model. PRS, polygenic risk score; GWIA, genome-wide interaction analysis; SNP, single nucleotide polymorphism; HR, hazard ratio; CI, confidence interval

Discussion

In this multi-centre study, four distinct trajectories of BMI during adulthood were identified, and subjects who progressed from a normal BMI at the onset of adulthood to overweight or obesity at baseline had a lower risk of developing NSCLC than those who maintained a stable normal BMI in this PLCO cohort (Fig. 2A). In addition, interaction analysis provided evidence that the association between BMI trajectories and NSCLC risk differed according to genetic variation at SNPs rs4726760, rs16018, rs2336652, and rs79297227.

The results of this study suggested that the BMI trajectory from normal weight to overweight or obesity was associated with protective effects against NSCLC development, which was consistent with previous epidemiological studies [1, 2, 41,42,43]. Several hypotheses have been postulated to explain the relationship between leanness and a higher risk of lung cancer. For example, smoking, as a dominant risk factor for lung cancer, usually leads to lower body weight, which may explain the observed inverse BMI-lung cancer association. However, several large prospective studies show an inverse association between BMI and lung cancer risk, and this association persists after excluding up to 10 years of follow-up, suggesting that it is not entirely due to smoking [44]. Moreover, never-smokers were more likely to have a stable normal BMI trajectory according to a stratified analysis of smoking status, although never-smokers in each BMI trajectory group accounted for about 50% of our analysis. Likewise, it has been suggested that weight loss represents a preclinical event prior to the clinical manifestation of lung cancer [45]. However, our sensitivity analysis suggested that BMI trajectories resulting in overweight or obesity were associated with a lower risk of lung cancer, even after excluding patients who developed the disease during the first, second, or fourth year of follow-up. Interestingly, interaction analysis of PRSGWIA with BMI trajectories on NSCLC risk indicated that BMI progression from normal to overweight or obesity was associated with higher NSCLC risk among individuals in the high tertiles of wPRSGWIA or sPRSGWIA. Specifically, they experienced a gradually increased NSCLC risk across the BMI trajectories from normal to normal, normal to overweight, overweight to obese, and normal to obese, whereas individuals in the low or intermediate tertiles of wPRSGWIA or sPRSGWIA showed the opposite pattern (Fig. 3).
In addition, the identified SNPs were located in or near genes that might be involved in biological pathways leading to lung cancer. The gene BRAF near rs4726760 provides instructions for making a protein that helps transmit chemical signals from outside the cell to the nucleus. This protein is a component of the extracellular signal-regulated kinase (ERK)/mitogen-activated protein kinase (MAPK) pathway, which regulates several important cell functions including cellular proliferation, differentiation, migration, and apoptosis. Chemical signalling through this pathway is essential for normal development before birth. BRAF is also an oncogene. When mutated, oncogenes have the potential to cause normal cells to become cancerous [46]. BRAF mutations are seen in 3–5% of NSCLC cases [47]. It is generally believed that obese people eat nutrient-rich foods, and studies have found that nutrients (antioxidants) can significantly inhibit the MAPK signalling pathway to reduce the inflammation response related to the risk of cancer [48]. The MAPK pathway plays an important role in the differentiation of adipocytes [49], and ERK is essential for the transcription of the genes encoding CCAAT/enhancer-binding protein α/β/δ and peroxisome proliferator-activated receptor gamma (PPARγ), key factors of adipocyte differentiation. When the ERK signalling pathway is activated, PPARγ is phosphorylated and its transcriptional activity is reduced, which inhibits adipocyte differentiation [50]. Decreased adipocyte differentiation reduces the accumulation of adipocytes, thereby reducing the incidence of inflammation that may be related to pathological obesity (Fig. 2C).

The SNP rs16018 is located in the gene CACNA1A, a member of the family of voltage-gated calcium channel genes, which is upregulated in numerous types of cancer including lung cancer [51]. The roles of calcium channels in various cell functions including mitogenesis, cell proliferation, differentiation, inflammation, and metastasis are well recognized [52]. Through calmodulin, intracellular calcium (Ca2+) levels regulate many different kinases, phosphatases, cyclases, esterases, and ion channels. Increased intracellular Ca2+ levels are correlated with cell proliferation, leading to inflammation and promoting carcinogenesis [51]. Subjects with a higher BMI may have sufficient nutritional status, and current studies have demonstrated that a higher intake of nutrients (e.g. high dietary calcium) can modulate circulating calcitriol, thereby regulating intracellular Ca2+ levels [53], maintaining the balance of intracellular and extracellular Ca2+ concentrations and reducing the risk of lung cancer.

The SNP rs2336652 is located near CLASP2, whose protein product interacts with cytoplasmic linker protein, binds to microtubules, and has microtubule-stabilizing effects [54]. Increasing microtubule instability may cause genetic instability, and altered expression of CLASP2 may induce genetic instability and contribute to the development of lung cancer [55]. The variant rs79297227 is located in SLC16A7. The SLC16A family of monocarboxylate transporters is a subfamily of solute carriers that transport monocarboxylate molecules, including L-lactate and pyruvate, across cell membranes [56]. Aberrant expression of SLC16A gene family members occurs in various types of malignant tumours and regulates cell migration, invasion, and proliferation [57,58,59].

MR analysis revealed non-significant associations between genetic polymorphisms affecting BMI and NSCLC. Although MR is considered a powerful tool to infer causality from nature’s randomization, it cannot completely avoid bias and confounders; thus, the results of MR studies warrant a cautious interpretation [60]. For example, BMI is strongly affected by smoking status, age, sex, and ethnicity [61]. However, confounders cannot alter the genetic variants themselves, and it is possible that the attenuation of a protective effect against NSCLC was caused by adjustment for mediators on the causal pathway or by collider bias [62]. Finally, the use of BMI variants in MR as proxies for BMI trajectories had inherent limitations due to the lack of previous GWAS studies on BMI trajectories, and insufficient PLCO genetic data despite the large sample in the PLCO cohort.

Our study had several strengths. First, this study was performed in a multi-centre cohort with a large sample size. Second, we not only investigated the association between BMI trajectories and the NSCLC risk but also evaluated the interaction between BMI trajectories and genetic variants in the development of NSCLC. Third, we identified four novel and functionally plausible GWIA-based SNPs, which are located in or near genes that play critical roles in cell growth, differentiation, and inflammation and are mechanistically linked to BMI and NSCLC genesis. However, this study also has limitations. First, as in nearly all epidemiologic studies on the subject, BMI at ages 20 and 50 was obtained from individuals’ self-reports. However, that information was obtained before the subsequent development of the outcomes of interest, so recall bias could not have been operative. Second, the substantial number of exclusions could limit generalizability, although restricting the study cohort to those with complete data should help mitigate threats to internal validity. Third, residual or unmeasured confounding cannot be excluded even though exhaustive adjustment was performed in the multivariable analyses. In addition, conclusions from Mendelian randomization, which purportedly provides a methodologic approach for causal inference, should also be treated with caution. Fourth, our findings have not been validated by other larger-sample epidemiological studies, especially given the limited sample size of the PLCO GWAS data. Finally, additional functional studies are warranted to elucidate the mechanisms underlying the effects of the interactions between these loci and BMI trajectories on NSCLC risk.

Conclusions

Our study found that genetic susceptibility may modify the effect of BMI trajectories on the development of NSCLC by regulating cell growth, differentiation and inflammation. Further larger or multi-ethnicity studies should be conducted to validate our findings.