Clinical Relevance of Gastroesophageal Cancer Associated SNPs for Oncologic Outcome After Curative Surgery

Background Gastric and esophageal cancers are malignant diseases with rising importance in Western countries. To improve oncologic outcome after surgery, it is essential to understand the relevance of germline mutations. The aim of the study was to identify and distinguish clinically relevant single-nucleotide polymorphisms (SNPs). Patients and Methods In total, 190 patients with curative oncological resections of gastric and distal esophageal adenocarcinomas at Heidelberg University Hospital were eligible for this study. Outcome differences were determined for each SNP by analysis of clinical variables, survival, and mRNA expression levels. Results Significant survival differences were found on univariate analysis for usual prognostic variables (such as pTNM) and for six SNPs. On multivariate survival analysis, the SNPs rs12268840 (intron variant of MGMT, p = 0.045) and rs9972882 (intron variant of STARD3 and eQTL of PGAP3, p = 0.030) were independent and significant survival predictors along with R status and pT/pN category. Group TT of rs12268840 had the highest rate of second primary carcinoma (30.4%, p = 0.0003), lowest expression of MGMT based on cis-eQTL analysis in normal gastroesophageal tissue (p = 1.99 × 10−17), and worst oncologic outcome. Group AA of rs9972882 had the highest rate of distant metastases pM1 (42.9%, p = 0.0117), highest expression of PGAP3 (p = 1.29 × 10−15), and worst oncologic outcome. Conclusions Two intron variant SNPs of MGMT and STARD3 were identified that were significant survival predictors and may influence tumor biology. The data indicate that DNA methylation (MGMT) and malfunction of GPI anchoring (PGAP3) are distinct mechanisms that are relevant for tumor progression and relapse. Supplementary Information The online version contains supplementary material available at 10.1245/s10434-021-10771-y.

Genome-wide genotyping is a frequently used technique that has been widely practiced in the past. Recently, various single-nucleotide polymorphisms (SNPs) were identified that associate with either GERD, BE, GEJ, or GC. 6 Most studies utilize high case numbers and hundreds of thousands of SNPs, but do not address specific clinical parameters other than the rough differentiation between case and control groups. However, it is necessary to establish ties between genotyped data and oncological outcome by making use of systematic follow-up on a welldifferentiated cohort. While the number of allegedly relevant SNPs increases, it becomes harder to maintain an overview of those SNPs, which might also have a clinical significance and lead back to molecular pathways that are actually relevant. By correlating genotyped and imputed data with oncologic outcome parameters, we aim to identify clinically relevant loci.

Data Collection
Eligible subjects of this study were patients with GC or GEJ who underwent oncological resection of gastric or esophageal cancer between 2008 and 2017 at Heidelberg University Hospital, Department of General Surgery. All patients provided written consent for data collection and analysis. Patients were excluded who deceased during the early postoperative phase or in-hospital as oncologic outcome was assessed. The trial protocol was approved by the ethics committee at the University of Heidelberg and was performed in accordance with the Declaration of Helsinki and Good Clinical Practices as well as local ethics and legal requirements.

Genotyping
Prior to oncological resection, genomic DNA was extracted from peripheral blood leukocytes. Genotyping was performed by collaborating partners at the Department of Human Genetics, University of Bonn, Germany. It was accomplished for all patients using Infinium OmniExpress, Infinium OmniExpressExome, and Infinium Omni2.5Exome BeadChips (by Illumina) according to the manufacturer's protocol at the University of Bonn. The overlap of the SNP content was subjected to initial quality control performed with PLINK v1.90b6.6. Samples with genotype call rate less than 97%, discrepancies in sex, divergent ancestry from the CEU HapMap 2010 population, and related samples were excluded from further analysis.

Imputation
All unambiguous SNPs with SNP call rate [95%, minor allele frequency (MAF) [1%, and Hardy-Weinberg equilibrium (HWE) [ 0.001 were used for imputation. Imputation was performed with IMPUTE2 based on 1,000 Genomes Phase 3 as reference panel. As post-imputation quality control, we excluded all variants with an information score (INFO-score) \ 0.8, HWE \ 0.001, MAF \ 1%, and SNP missing rate [ 5% for best-guessed genotypes at posterior probability greater than 0.9 from further analysis.

Histopathology and Follow-Up
In all cases, the diagnosis of gastric and esophageal adenocarcinoma was histopathologically confirmed and the resected specimens were analyzed at the Department of Pathology, University of Heidelberg. The histopathological staging was performed according to the 8th UICC edition, including TNM categories, R status, and grading. 7 Inhospital mortality and postoperative complications were recorded during hospital course. To acquire long-term survival data, all patients were systematically followed up via continuous surveys. The relevant survival data for this patient collective were gathered until October 2020.

Statistical Analysis
We extracted the genotype information from the initial PLINK files by using Gtool v0.7.5. The data were collocated using Microsoft Office Excel, and statistical tests as well as plots were calculated thereafter via GraphPad Prism 9 for Mac OS X. We defined overall survival from time of first diagnosis until death or until the most recent follow-up using the Kaplan-Meier method. Differences in survival were analyzed via Mantel-Cox (logrank) comparisons on STATA/SE 15.0 for Mac. A level of significance of a = 0.05 was generally considered as statistically significant. For multivariate survival analysis, we established a Cox proportional hazards model with a level of significance of a \ 0.2. All clinical and SNP parameters from the univariate studies with a p-value \ 0.2 were included. The preoperatively assessed cTNM categories were excluded since a precise histopathology was available via the final pTNM histology.

cis-eQTL Analysis in Normal Tissue
For cis expression quantitative trait loci (cis-eQTL) analyses in normal tissue, we accessed the Genotype-Tissue Expression (GTEx) database (Version 8; Release date 18 July 2019: dbGaP Accession phs000424.v8.p2; http:// www.gtexportal.org/home/). All local genes ?/-1 Mb of the selected SNPs were extracted based on data provided by the GENCODE project. We retrieved the summary statistics of linear regression results via GTEx eQTL Calculator and generated violin plots via GTEx eQTL Dashboard with layout modifications via Adobe Illustrator CC 2018 (Version 22.0.0).

Comparative Study-Whole Collective, GC, and GEJ
In total, 61 out of 190 patients (32.1%) were diagnosed with GEJ (Siewert type I-III) and 129 patients (67.9%) had GC. Among GEJ patients, 82.0% were male, as opposed to 58.1% of GC patients (p = 0.0017). GEJ patients had a higher mean BMI of 26.9 kg/m 2 compared with GC patients (mean BMI 25.3 kg/m 2 , p = 0.0196). Besides neoadjuvant therapy (85.2% in GEJ versus 57.4% in GC, p = 0.0001), there were no significant differences between GEJ and GC patients regarding usual survival predictors. The results are demonstrated in Table 1 for the whole collective and separately for GEJ and GC in comparison, showing the significant variables that were also used in the latter multivariate survival analysis. It is important to note that both tumor entities GEJ and GC did not show any significant differences in overall survival (p = 0.9417) and disease-free survival (p = 0.3893). The corresponding Kaplan-Meier curves are demonstrated in Supplementary  Fig. 1a, b.

Overall Survival-Clinical Parameters
Median follow-up duration was 49.08 months (73.25 months for living cases and 20.74 months for deceased cases). The preoperative variables cT (p = 0.0045), cN (p = 0.0079), and cM category (p \ 0.0001) were significant predictors of survival. Neoadjuvant therapy (p = 0.0229) and type of operation (p \ 0.0001) were also clinical variables that were shown to have statistical significance. Finally, the histopathological parameters pT (p \ 0.0001), pN (p \ 0.0001), pM (p \ 0.0001), and R status (p \ 0.0001) were significant for outcome. All results are summarized in Supplementary Table 1A,B.

Overall and Disease-Free Survival-SNPs
In total, 68 SNPs, which were identified in the literature search, were analyzed by univariate survival analysis (Supplementary Table 2A,B). Six SNPs showed significant differences on univariate overall survival (OS) analysis. The SNP IDs, genetic information, and median survival time (in months) are summarized in Table 2.

Multivariate Regression
To further evaluate the significance of the identified SNPs, a multivariate regression analysis was performed including the clinical prognostic factors ( Table 3). The histological parameters pT (p = 0.020) and pN category (p = 0.001) as well as R status (p = 0.006) were statistically significant as well as the SNPs rs12268840 (p = 0.033) and rs9972882 (p = 0.041). In a confirmative Cox regression model, the parameters pT category (p = 0.009), pN category (p \ 0.001), R status (p \ 0.001), rs12268840 (p = 0.045), and rs9972882 (p = 0.030) were significant survival predictors.
Characterization and Clinical Differences for rs12268840 (MGMT) and rs9972882 (STARD3) Prognostic parameters such as the pT (p = 0.4966), pN (p = 0.1675), and pM category (p = 0.5984) or the R status (p = 0.6419) were not different between the three groups CC, TC, and TT for SNP rs12268840. However, we found that rates of second primary carcinomas (SPC) were significantly higher in the TT group (average rate 30.4%) as compared with 9.2% in the TC group and 4.0% in the CC group (p = 0.0003). Furthermore, the three groups differed significantly in preoperatively detected Laurén type (p = 0.0243).
In total, 42.9% of patients in group AA of SNP rs9972882 had distant metastases (pM1) on final histopathological workup as opposed to 12.3% in group AC and 13.5% in group CC (p = 0.0117). The remaining histological parameters pT (p = 0.1029) and pN category (p = 0.1226) as well as the R status (p = 0.6976) were not significantly different between the groups, either. All

Cis Expression Quantitative Trait Loci (cis-eQTL) in Normal Upper Gastrointestinal Tissue
To assess if the above-mentioned two SNPs were also able to change the level of gene expression, we further performed cis expression quantitative trait loci (cis-eQTL) analysis for all local genes in normal gastric and esophageal tissue (mucosa, muscularis, and gastroesophageal junction). The results are presented in Supplementary  Table 4A,B.
We found that rs12268840 was a highly significant cis-eQTL for MGMT in all upper gastrointestinal tissue types. This is supported by the normalized effect size (NES) of -0.60 in gastroesophageal junction tissue (p = 1.5 9 10 -23 ), meaning that the alternative allele (in this case, T) leads to a significantly lower expression of MGMT as compared with the reference allele C. Likewise, in esophageal mucosa NES was -0.64 (p = 3.5 9 10 -35 ), in esophageal muscularis -0.52 (p = 3.0 9 10 -28 ), and in gastric tissue -0.56 (p = 2.0 9 10 -17 ) (Fig. 2a).

DISCUSSION
In this study we retrieved SNPs from the current literature that were reported to be associated with either BE, GERD, GEJ, or GC. We then analyzed theses SNPs via genotyped and imputed data of curatively resected patients at our medical center. We were able to show that two intron variant SNPs of MGMT and STARD3 were significant predictors for overall survival but not for disease-free survival in gastroesophageal adenocarcinoma. Group TT of rs12268840 showed the highest rate of second primary carcinomas, lowest mRNA expression of MGMT in normal tissue, and worst oncologic outcome, implicating that rs12268840 could be involved in tumor progression. Group AA of rs9972882 had the highest rate of distant metastases (pM1), highest mRNA expression of PGAP3, and also worst outcome. The results are consistent with previous works and open a new perspective on clinically relevant germline SNPs that could directly affect carcinogenesis and tumor relapse in upper gastrointestinal cancer.
The SNPs that were selected by this literature-based approach have been identified as relevant for esophageal and gastric adenocarcinoma according to previous genomewide association studies (GWAS) on mostly large collectives. Similar attempts of correlating SNPs with clinical outcome parameters were made by Sung et al. in 2017. 8 The authors analyzed 11 SNPs that were previously identified as high-evidence genetic susceptibility markers for GC in a former meta-analysis. 9 Sung et al. were not able to find significant associations of these 11 SNPs with overall survival. However, the authors found subtype-specific associations for GC of the cardia region and diffuse-type GC. In our study, we confirmed usual prognostic factors such as pTNM category and R status on univariate survival analysis and all variables (except pM category) on multivariate analysis. Based on these results, we assume that this study's validity is ensured concerning the identification of further survival predictors. Only two SNPs, namely rs12268840 (intron variant of MGMT) and rs9972882 (intron variant of STARD3), were significant on multivariate analysis and will therefore be further discussed. According to the Cox regression model, both SNPs were independent survival predictors next to pT/pN categories and R status. SNP rs12268840 is located on chromosome 10 and is known as an intron variant of MGMT that encodes for the O-6-methylguanine-DNA methyltransferase (abbreviated as MGMT). DNA methylation is one of the most important types of epigenetic modification and plays a crucial role in carcinogenesis. 10 MGMT is a DNA-repair enzyme that protects cells from carcinogenic effects by removing DNA adducts of alkylating agents from the O-6 position of guanine. 11 Thus, studies have shown that loss of MGMT leads to an increased carcinogenic risk in mice. 12 For gastric cancer, Zhang et al. were able to demonstrate that MGMT silencing via promoter hypermethylation was significantly associated with an increased risk of GC. 13 Other works have shown that reduced expression of MGMT in esophageal adenocarcinoma organoids led to sensitivity towards temozolomide and taxane agents. 14 The study by Doecke et al. that initiated the inclusion of rs12268840 to this literature-based approach found that homozygous carriers of rs12268840 (i.e., group TT) with frequent acid reflux had a significantly higher risk of GEJ. 15 Our cis-eQTL analysis has shown that, in upper gastrointestinal tissues of healthy probands, group TT expressed the least MGMT mRNA in comparison with groups CC and CT (Fig. 2a). Furthermore, within the groups of rs12268840, there were significant different rates of second primary carcinomas (SPC). SPCs are defined as any type of malignant disease that was reported in the patient's past medical history and was treated beforehand. 16 Patients in group TT had the highest rate of SPC, while patients in group TC and CC had lower rates of SPC and significantly higher median OS. We interpret these results as group TT possibly having had a lack of MGMT to fight malignant diseases in the past medical history as well as in the postoperative follow-up. This may have led to a worse oncologic outcome in general.
Although SNP rs9972882 most likely qualifies as an intron variant of STARD3, it proved to be a significant cis-eQTL for PGAP3 according to mRNA expression in normal upper gastrointestinal tissue. In contrast, rs9972882 was not a significant eQTL for neighboring genes STARD3 or ERBB2 (Her2/neu).
PGAP3 transcribes the post-GPI attachment to proteins phospholipase 3, which is an enzyme for fatty acid remodeling of glycosylphosphatidylinositol (GPI)-anchored proteins (abbreviated as GPI-APs). GPI-APs are glycoproteins that are anchored to the outer layer of the plasma membrane and therefore exposed on the cell surface. Precisely, PGAP3 is responsible for the removal of unsaturated fatty acids at the sn-2 position of GPI-APs. Germline mutations of PGAP3 lead to a subtype of hyperphosphatemia and intellectual disabilities, also known as Mabry syndrome type 4. 17 Kwon et al. 18 have found that PGAP3 is co-amplified (together with STARD3, GRB7, and MIEN1) with ERBB2(Her2/neu) in esophageal and gastric cancer. Murakami et al. 19 were able to show that PGAP3 knockout mice had enhanced responses to alloreactive and antigen-specific stimuli. It is clear that GPI-APs play important roles in many biological processes such as signal transduction and cell-cell interaction. Our data could be interpreted that overexpression of PGAP3 within group AA of rs9972882 represents a malfunctioning of GPI anchoring. The classification of gastric cancer by The Cancer Genome Atlas (TCGA) Research Network into Epstein-Barr virus (EBV)-positive, microsatellite instability (MSI)high, genomically stable (GS), and chromosomally instable (CIN) subtypes has drawn attention to a general molecular differentiation of gastric cancer. 20 The differentiation of molecular subtypes could be especially relevant for the prediction of potential susceptibility for (neo-)adjuvant treatment. 21 So far, the CIN subtype was encountered more frequently in the gastroesophageal junction and cardia, 22 which was also the most frequent anatomic location of gastric adenocarcinoma in this study group. Due to its role as a DNA-repair enzyme, MGMT and its dysfunction may represent another aspect of the CIN subtype demonstrated by TCGA. However, tumor tissue and sequencing patterns would be necessary to properly investigate the effect of MGMT expression (and PGAP3, respectively) on actual prognosis for the separate TCGA subtypes. For this reason, genomic and tumorspecific data analysis will be necessary to infer that these results are generally applicable to all gastroesophageal cancer subtypes.
This study is based on a data collection of numerous clinical variables to enable a profound analysis of essential prognostic factors accompanied by reliable survival data from a systematic follow-up. Moreover, our analyses allow a linkage and translational approach to investigate histopathological, oncological, and genotyped features of upper gastrointestinal tumors. The ultimate goal of this study was to identify those genetic alterations that actually have a clinical implication in terms of a significantly different prognosis. However, it is necessary to remark that there may be a potential selection bias concerning those SNPs that could be of interest but were not investigated due to missing reports in the literature. Another limitation of this study is that both GEJ and GC were analyzed in the same collective in terms of testing for specific SNP markers. However, we were able to prove beforehand that both groups did not differ in decisive outcome parameters. Last but not least, the separate analysis of both groups GEJ and GC did not show any significant differences, neither for overall nor for disease-free survival. Previous works by our own group have demonstrated the comparability of both cancer entities regarding long-term oncological outcome. 23 This circumstance may also be in accordance with the insights by TCGA that distant esophageal adenocarcinoma and chromosomally unstable gastric adenocarcinoma (the most frequent TCGA subtype of gastric cancer) share many genomic amplifications and could possibly be considered a single disease entity. 24 By correlating the above-mentioned SNPs with our clinical data, we identified two clinically relevant SNPs. Our data suggest that the DNA-repair enzyme MGMT is relevant in gastroesophageal cancer and that PGAP3 is a potentially novel agent in carcinogenesis. In the future, it will be necessary to reproduce these results for larger and more homogeneous samples. Eventually, in vitro studies are required to comprehend the molecular mechanisms of MGMT and PGAP3. OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.