Gastroesophageal reflux disease (GERD) is a common gastrointestinal disorder, affecting approximately 13% of the worldwide population and 20% of the adult population in the western countries [1, 2]. GERD impairs the patients’ life quality and increases the risk of other esophageal complications, such as esophagitis, esophageal strictures, Barrett esophagus, and esophageal adenocarcinoma [1, 2]. Epidemiological studies have revealed several possible risk factors for GERD, including excess adiposity [3,4,5,6,7], diabetes [8], smoking [6, 7, 9,10,11,12], alcohol consumption [11, 13], and coffee and caffeine consumption [6, 14]. However, evidence on most associations is equivocal with inconsistent findings across studies [15,16,17,18,19]. Additionally, unobserved confounding, misclassification, reverse causality, and other biases may hinder causal inference in these associations in observational studies. Determining the causal link of potentially modifiable risk factors with GERD is of great importance in understanding the etiology of this disease as well as in preventing and managing the disease in the clinical settings.

Mendelian randomization (MR) design uses genetic variants as instrumental variables for an exposure and can strengthen the causal inference [20]. This design can diminish residual confounding since the genetic variants are randomly assorted at conception and therefore have limited correlations with environmental and self-adopted factors [20]. In addition, the MR design can minimize the possibility of reverse causality because genetic variants cannot be modified by the development or progression of the disease [20]. A recent MR study in the UK Biobank study found an association for waist-to-hip ratio, but not for body mass index (BMI), smoking or caffeine consumption with risk of GERD, whereas the analyses for smoking and caffeinated-coffee consumption were underpowered [21]. Here, we conducted a two-sample MR study to examine the associations of overall and central adiposity, diabetes, smoking and alcohol, coffee, and caffeine consumption with risk of GERD based on more GERD cases and genetic instruments that explain more phenotypic variances.


Study design

Figure 1 shows the study design overview. This study was based on summary-level data on measures of adiposity, type 2 diabetes, lifestyle factors, and GERD from published genome-wide association studies (GWASs) and consortia. All studies included in the cited GWASs and consortia had been approved by a relevant review board and involved participants had given informed consent. The present MR analyses were approved by the Swedish Ethical Review Authority (2019‐02,793).

Fig. 1
figure 1

Study design overview. LD, linkage disequilibrium; SNPs, single-nucleotide polymorphisms. There are three assumptions of Mendelian randomization design. The first assumption is that the genetic variants used as instrumental variables should be robustly associated with the exposure; the second assumption is that the used genetic variants should not be associated with any confounders; and the third assumption is that the selected genetic variants should affect the risk of the outcome merely through the risk factor, not via alternative pathways

Instrument variable selection

Genetic variants (i.e., single nucleotide polymorphisms, SNPs) associated with BMI [22], waist circumference [23], type 2 diabetes [24], smoking initiation [25], and alcohol [25], coffee [26], and caffeine [27, 28] consumption at the genome-wide significance level (p < 5 × 10–8) were obtained from corresponding GWASs (Table 1). Smoking initiation was defined as a binary phenotype representing whether an individual had ever smoked cigarettes regularly (current or past smoker) [25]. SNPs associated with waist circumference adjusted for BMI was used to examine the BMI-independent effect of waist circumference [23]. Linkage disequilibrium among the SNPs was estimated using 1000 Genomes European panel as the reference population. Independent SNPs (i.e., SNPs without linkage disequilibrium, defined by r2 <0.001 and clumping window > 10,000 kb) were used as instrumental variables.

Table 1 Information on used studies and consortia

Gastroesophageal reflux disease data source

Summary-level data on the associations of exposure-related SNPs with GERD were obtained from a genome-wide association meta-analysis of the UK Biobank study and QSkin Sun and Health Study including a total of 71 522 GERD cases and 261 079 controls of European descent [29]. GERD cases were defined by filed codes of self-report, International Classification of Disease 9 and 10, the Office of Population Censuses and Surveys, and treatment/medicine in the UK Biobank study (68 535 cases and 250 910 controls), and self-reported heartburn and medical records of reflux medications in QSkin Sun and Health Study (2987 cases and 10 169 controls) [29]. The GWAS analysis was adjusted for recruitment age, genetic sex, the first ten principal components, and cryptic relatedness.

Statistical analysis

The inverse variance weighted method was used as the main statistical method (the random-effects model for the exposure constructed by ≥ 3 SNPs and the fixed-effect model for the exposure constructed by < 3 SNPs). Several sensitivity analyses, including the weighted median [30], MR-Egger [31], MR-PRESSO [32], and contamination mixture method [33] were conducted to examine the consistency of associations and detect and correct for horizontal pleiotropy. Assuming > 50% of weight from valid SNPs, the weighted median method can provide consistent estimates [30]. MR-Egger analysis can generate pleiotropy-corrected estimates if the intercept test detects significant horizontal pleiotropy (p for intercept < 0.05); however, the model is usually underpowered [31]. MR-PRESSO method can detect outlying SNPs and provide estimates after removal of outliers [32]. The embedded distortion test can examine the difference in estimates before and after removing outliers [32]. The contamination mixture method can provide robust estimates in analysis using hundreds of SNPs as instrumental variables with the presence of invalid SNPs [33]. To assess whether genetic liability to type 2 diabetes is associated with GERD risk independently of BMI, we performed the multivariable MR analysis with adjustment for genetically predicted BMI. Cochrane’s Q was used to assess the heterogeneity of estimates of SNPs. Power was estimated using an online tool (Table 1) [34]. Associations with p value < 0.007 (0.05/7 exposures) were deemed significant associations, and associations with a p value ≥ 0.007 and ≤ 0.05 were regarded as suggestive associations. All tests were two-sided and performed using the TwoSampleMR [35], MR-PRESSO [32] and MendelianRandomization [36] packages in the R software (version 4.0.2).


Higher genetically predicted BMI and genetic predisposition to type 2 diabetes (not diabetes diagnosis) and smoking initiation was associated with an increased risk of GERD (Fig. 2). The odds ratio (OR) of GERD was 1.49 (95% confidence interval (CI), 1.40, 1.60; p = 4.09 × 10–33) for one standard deviation (SD) increase in BMI, 1.07 (95% CI, 1.04, 1.10; p = 5.73 × 10–7) for one-unit increase in log-transformed OR of type 2 diabetes (not diabetes diagnosis), and 1.41 (95% CI, 1.31, 1.52; p = 2.12 × 10–19) for one SD increase in prevalence of smoking initiation. The association for type 2 diabetes slightly attenuated but remained significant in the multivariable MR analysis with adjustment for genetically predicted BMI (OR, 1.05, 95% CI, 1.03, 1.08; p = 2.76 × 10–4). There were suggestive associations for genetically predicted waist circumference (OR per one SD increase, 1.14, 95% CI, 1.02, 1.26; p = 0.017) and caffeine consumption (OR per 80 mg increase, 1.08, 95% CI, 1.02, 1.15; p = 0.013) (Fig. 2). The association for waist circumference attenuated in the analysis using SNP associated with waist circumference adjusted for BMI (OR per one SD increase, 1.08, 95% CI, 0.98, 1.18; p = 0.134) (Fig. 2). We did not observe any association of genetically predicted alcohol or coffee consumption with risk of GERD in the main analysis (Fig. 2).

Fig. 2
figure 2

Associations of genetically proxied adiposity, type 2 diabetes, and lifestyle factors with risk of gastroesophageal reflux disease. BMI, body mass index; CI, confidence interval; OR, odds ratio. Estimates were obtained from the inverse variance weighted method with random-effects with the except for estimate for caffeine consumption that was obtained from the inverse variance weighted method with fixed-effects.

Associations for all exposures remained overall consistent in sensitivity analyses (Table 2). The association became stronger for waist circumference adjusted for BMI in MR-Egger analysis and for waist circumference in the contamination mixture analysis (Table 2). Horizontal pleiotropy was observed in the MR-Egger analysis of BMI and type 2 diabetes (p for intercept < 0.05) (Table 2). One to ten outliers were detected by MR-PRESSO analyses; however, the associations remained consistent after removal of these outliers and no difference in estimates before and after removing outliers was observed (p for distortion test > 0.05) (Table 2).

Table 2 Associations of genetically predicted risk factors with gastroesophageal reflux disease in Mendelian randomization sensitivity analyses


This MR study found that higher genetically predicted BMI and genetic liability to type 2 diabetes and smoking were associated with increased GERD risk. There were suggestive associations of genetically predicted higher waist circumference and caffeine consumption with an increased risk of GERD. The association for waist circumference attenuated after adjustment for genetically predicted BMI. Limited data was observed in support of an association of genetically predicted alcohol and coffee consumption with GERD.

Review articles of the association between obesity and GERD have found consistent evidence that overweight and obesity were associated with an ascended risk of GERD [3, 4, 17]. In a meta-analysis of 20 observational studies (mostly case–control and cross-sectional studies) with a total of 18 346 GERD patients, overweight and obesity was associated with a 57% and 115% higher risk of GERD, respectively [3], which is in line with our findings. This association is also supported by a cohort study including 29 610 Norwegians where weight loss was found to be associated with an increased odds of loss of GERD [5] and several randomized controlled trials [7]. However, a recent MR study of adiposity-related phenotypes in relation to GERD found that central adiposity, measured by waist-hip ratio instead of overall adiposity measured by BMI, showed a causal association with the increased risk of GERD [21]. Even though our study confirmed the positive association between central obesity measured by waist circumference and GERD, the association attenuated after adjustment for BMI. Thus, our data supported a stronger impact of overall adiposity, measured by BMI, compared to central adiposity on GERD. Several underlying mechanisms may explain the association between obesity and GERD, including esophageal motor abnormalities, low esophageal sphincter abnormalities, elevated intra-abdominal and intragastric pressures, increased frequency of transient sphincter relaxation, and esophageal inflammation [4, 37].

Gastrointestinal symptoms are frequently encountered in patients with type 2 diabetes. The association between type 2 diabetes and GERD was assessed in a meta-analysis including nine observational studies and the OR of GERD was 1.61 for diabetic individuals compared to non-diabetic controls [8]. Our MR study strengthened the causal nature of this positive association and further revealed that this association was independent of BMI. However, our MR estimate could not be compared with observational estimates since the risk of GERD was scaled to modelled liability to diabetes instead of diabetes diagnosis. Even though pathological pathways linking diabetes to GERD have not been fully investigated, the adverse effects of its upstream factors (e.g., obesity, smoking, etc.) on GERD and the autonomic neuropathy, especially vagal nerve damage in diabetic patients may explain this association [8, 38].

Evidence in support of an association between smoking and GERD was generally consistent [6, 7, 9,10,11,12]. A clinical trial in 14 smokers found that resuming smoking habits after abstaining from smoking for 48 h greatly increased acid reflux and heartburn [12]. The positive association between smoking and GERD was also revealed in case–control studies [11]. In The Trøndelag Health Study, smoking cessation was associated with an improvement in severe GERD in individuals with normal weight and anti-reflux medication, but not in individuals with minor GERD symptoms, overweight, or those using anti-reflux medication less than weekly [9]. Another study confirmed this association by comparing GERD improvement between the group that successful stopped smoking by varenicline and the group that failed to stop smoking [10]. Our MR study based on genetic data further strengthened the positive association between smoking and GERD even though we could not assess the interaction effects of body weight and the severity of the disease. A recent MR study did not observe this association, which might be caused by inadequate power [21]. Several mechanisms may decipher the increased risk of GERD in smokers, including reduced lower esophageal sphincter resting pressure (blocked cholinergic receptors by nicotine) and prolonged acid clearance time caused by reduced salivary secretion rate and bicarbonate concentration [39].

Consumptions of alcohol and coffee, two major beverages, have been associated with GERD risk with inconsistent or weak evidence for alcohol [11, 13, 15] and coffee consumption [6, 14, 15, 19]. A meta-analysis including 29 studies found an increased risk of GERD in regular alcohol drinkers compared to non- or occasional drinkers; however, this finding was majorly based on cross-sectional and case–control studies and therefore were prone to residual confounding from other alcohol intake-correlated lifestyles and behaviors as well as misclassification bias [13]. Our MR study did not observe a positive association between alcohol consumption and GERD risk, although we could not rule out the possibility that the observed null finding was led by an inadequate power or a possible association of heavy alcohol consumption or alcohol abuse with GERD. With regard to coffee consumption, our finding was in line with some [19] but not all [14] previous studies. Likewise, our null finding on coffee consumption is risked by an inadequate power. In a prospective analysis of data from Nurses’ Health Study II, consumption of coffee, tea, and soda (all contain caffeine) was associated with an increased risk of GERD symptoms, which may partly support the observed positive for caffeine consumption [6, 14], although the associations for coffee, tea, and soda in that study did not vary by caffeine status. Given that limited data examined the link from overall caffeine consumption to GERD risk, more study is needed.

Limitations need considerations when interpreting our results. The important limitation is the possible effect of horizontal pleiotropy. In this study, we observed significant pleiotropic effects in the analyses of BMI and type 2 diabetes, but not for other exposures. However, the associations for BMI and diabetes were consistent in sensitivity analyses. Anti-diabetes medications may be a pleiotropic source in the analysis of type 2 diabetes [40]. From the time logic (medications after disease onset), this is a vertical pleiotropy, which does not influence causal inference. Even though our analyses were based on a GERD dataset with a large sample size, we might overlook weak associations, especially for the exposure constructed by few SNPs that explain a small phenotypic variance. We had inadequate power in the analyses for waist circumference adjusted for BMI, and in the analyses of alcohol and coffee consumption. Our analysis included approximately 30% of GERD cases defined by self-reported information only in the UK Biobank, which might introduce outcome misclassification. However, the GERD genome-wide association study found strong genetic correlations across three GERD phenotypes defined by ICD10, self-reported GERD, and use of GERD medication, respectively (0.92 < rg < 0.97) [29], which indicated a good validity of self-report outcome data. Thus, the bias caused by self-report data should be minimal. Our study was confined to individuals of European descent, which reduced the population structure bias. On the other hand, this confinement may limit the generalizability of our findings to other populations. For smoking behaviors, the interaction effect with body weight and the severity of GERD could not be examined in this study based on summary-level data. For alcohol and coffee consumption, the effects of different types of alcohol or coffee could not be differentiated.

In summary, this MR study suggests causal roles of obesity, diabetes, and smoking in the development of GERD. The possible association between high caffeine consumption and an increased risk of GERD warrants confirmations.