Genome-wide association study of subtype-specific epithelial ovarian cancer risk alleles using pooled DNA
- Cite this article as:
- Earp, M.A., Kelemen, L.E., Magliocco, A.M. et al. Hum Genet (2014) 133: 481. doi:10.1007/s00439-013-1383-3
Epithelial ovarian cancer (EOC) is a heterogeneous cancer with both genetic and environmental risk factors. Variants influencing the risk of developing the less-common EOC subtypes have not been fully investigated. We performed a genome-wide association study (GWAS) of EOC according to subtype by pooling genomic DNA from 545 cases and 398 controls of European descent, and testing for allelic associations. We evaluated for replication 188 variants from the GWAS [56 variants for mucinous, 55 for endometrioid and clear cell, 53 for low-malignant potential (LMP) serous, and 24 for invasive serous EOC], selected using pre-defined criteria. Genotypes from 13,188 cases and 23,164 controls of European descent were used to perform unconditional logistic regression under the log-additive genetic model; odds ratios (OR) and 95 % confidence intervals are reported. Nine variants tagging six loci were associated with subtype-specific EOC risk at P < 0.05, and had an OR that agreed in direction of effect with the GWAS results. Several of these variants are in or near genes with a biological rationale for conferring EOC risk, including ZFP36L1 and RAD51B for mucinous EOC (rs17106154, OR = 1.17, P = 0.029, n = 1,483 cases), GRB10 for endometrioid and clear cell EOC (rs2190503, P = 0.014, n = 2,903 cases), and C22orf28/BPIL2 for LMP serous EOC (rs9609538, OR = 0.86, P = 0.0043, n = 892 cases). In analyses that included the 75 GWAS samples, the association between rs9609538 (OR = 0.84, P = 0.0007) and LMP serous EOC risk remained statistically significant at P < 0.0012 adjusted for multiple testing. Replication in additional samples will be important to verify these results for the less-common EOC subtypes.
Epithelial ovarian cancer (EOC) is a heterogeneous cancer with distinct and clinically relevant subtypes that are characterized by differences in morphology, gene expression profile, and molecular genetic features (Gilks et al. 2008; Kalloger et al. 2010; Kobel et al. 2010). It has become apparent that the main histological subtypes, comprising ~70 % serous, 11 % endometrioid, 12 % clear cell, and 3 % mucinous EOC (Kobel et al. 2010), have different genetic (Gayther and Pharoah 2010; Lynch et al. 1991; Lynch et al. 1985; Shulman 2010) and epidemiologic (Faber et al. 2013) risk factors, precursor lesions (Pearce et al. 2012; Piek et al. 2001), pattern of spread, response to platinum-taxane based treatment, and patient outcome (Vaughan et al. 2011), compelling many to assert that they are different diseases (Gomez-Raposo et al. 2010; Kobel et al. 2010; Kurman and Shih 2010). Gene expression profiling has further classified serous EOC tumors into those of a less-common low-grade (3 % of all EOC) and more-common high-grade (68 % of all EOC) type (Kobel et al. 2010; Tothill et al. 2008), a finding supported by differences in the clinical behavior of these tumors (Matsuno et al. 2013). Much of the excess familial risk observed for EOC remains unexplained, and investigations that stratify by histological subtype may help to explain more of it.
Genome-wide association studies (GWAS) have identified several common susceptibility variants for EOC (Bolton et al. 2010; Goode et al. 2010; Permuth-Wey et al. 2013; Pharoah et al. 2013; Song et al. 2009). The majority of these have been most strongly associated with serous EOC, unsurprisingly given that the GWAS design carries forward for genotyping in subsequent stages those single nucleotide polymorphisms (SNPs) with the smallest P values associated with the most prevalent serous EOC subtype. Fewer genome-wide significant associations have been reported for the less-common EOC subtypes. In our most recent analyses of data from over 40 international studies of EOC within the Ovarian Cancer Association Consortium (OCAC), we reported that common susceptibility variants in the candidate HNF1B gene (Shen et al. 2013), and the candidate TERT locus (Bojesen et al. 2013) differentially associate with risk by subtype and imply that distinct mechanisms are involved in pathogenesis. Examining risk factors separately by histological subtype, together with assembling studies of large numbers of women with these cancers, is critical to understanding this disease.
To extend the findings of the existing EOC GWAS (Bolton et al. 2010; Goode et al. 2010; Pharoah et al. 2013; Song et al. 2009), we performed a GWAS according to EOC histological subtype using a DNA pooling strategy. The focus was to discover genetic variants associated with the less-common EOC subtypes [endometrioid, clear cell, mucinous, and low-malignant potential (LMP) serous]. The DNA pooling strategy is an efficient approach to assess genetic associations and has been successfully performed for various disease subtypes (Pearson et al. 2007; Schrauwen et al. 2009; Skibola et al. 2009). In the first stage, individual DNA samples were physically combined to create subtype-specific case pools, and a control pool, and DNA pools (not individual samples) were assayed using commercially available SNP arrays. Data from arrays were used to estimate SNP allele frequencies or allelotypes for each DNA pool, and not to determine genotypes. Pool allelotypes were then used in allele-based tests to evaluate SNP associations with EOC subtypes. In the second stage, replication of allelic associations was performed by individual genotyping (IG) for a large number of women contributing samples to the OCAC.
Discovery stage study population
Participants were from the Ovarian Cancer in Alberta and British Columbia (OVAL-BC) population-based case–control study (abbreviated as “OVA”). Eligible cases had incident, histologically-confirmed EOC, and were identified from the provincial cancer registries of British Columbia (BC) and Alberta (AB) between 2002 and 2011. Eligible control women were identified from provincial health care enrollment rosters or from a province-wide mammography program (BC after 2005). Participants provided blood or saliva samples for DNA. Of 1,578 cases and 2,222 controls (response 64.9 and 55.6 %, respectively) in OVA, 545 cases and 398 controls were recruited in BC before June 30, 2008 and comprised the discovery stage sample. The study was approved by the Research Ethics Boards of the BC Cancer Agency, the University of British Columbia and the University of Calgary. All subjects gave written informed consent.
Discovery stage pool construction and quality control (QC)
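[Table: Characteristics of samples in the discovery stage case–control DNA pools. Recovered column headings: pool size (N); age (standard deviation). Row labels and most cells were lost in extraction. Surviving cells: code lists 84721, 84803, 84703, 90151 and 84603, 84413, 84613; ages 53 ± 13.2, 56 ± 10.5, 53 ± 12.8, 62 ± 10.1, 57 ± 10.5.]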
DNA was extracted from peripheral venous blood (90 % of subjects) using a modified salting out protocol (Sambrook 2000), and from saliva (10 % of subjects) using OraGene kits (DNA Genotek, PA, USA). Genotyping call rates were compared previously between OVA blood (99.7 %) and saliva (98.1 %) (unpublished). DNA samples were adjusted to between 50 and 100 ng/μL and then precisely quantified in duplicate by fluorometry using PicoGreen™ (Molecular Probes, Eugene, OR, USA). For each EOC subtype, individual samples of 2–4 μL were manually pipetted into a single pool of 200 ng of DNA. Pools were assayed on Human660W-Quad v1 (660-Quad) genotyping beadchips (Illumina, San Diego, CA, USA), and imaged at The Centre for Applied Genomics (Toronto, ON, Canada). The red and green channel intensities for each 660-Quad array were extracted and used to estimate the relative allele frequency (RAF) of SNPs for each DNA pool as red intensity/(red intensity + green intensity), following a previously described approach (Pearson et al. 2007). Each DNA pool was assayed using 12 replicate 660-Quad beadchips. Replicate arrays were used to reduce the error in RAF estimation (Earp et al. 2011). Details of the 660-Quad array QC are described in the online Supplementary Methods.
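The RAF calculation described above can be illustrated with a minimal Python sketch. It assumes one (red, green) intensity pair per SNP per array and summarizes the replicate arrays by a simple mean and standard error; the function names and the example intensities are hypothetical, and any normalization applied in the actual pipeline is not represented.

```python
from statistics import mean, stdev


def relative_allele_frequency(red, green):
    """RAF for one SNP on one array: red / (red + green)."""
    total = red + green
    if total <= 0:
        raise ValueError("channel intensities must sum to a positive value")
    return red / total


def pooled_raf(replicates):
    """Mean RAF and its standard error across replicate arrays.

    replicates: list of (red, green) intensity pairs, one per array.
    """
    rafs = [relative_allele_frequency(r, g) for r, g in replicates]
    # Standard error of the mean; undefined with a single array.
    se = stdev(rafs) / len(rafs) ** 0.5 if len(rafs) > 1 else float("nan")
    return mean(rafs), se


# Hypothetical intensities for one SNP on 12 replicate arrays.
example = [(1000 + 10 * i, 3000 - 10 * i) for i in range(12)]
raf, se = pooled_raf(example)
print(f"RAF = {raf:.4f}, SE = {se:.4f}")
```

Averaging over replicates narrows the standard error roughly in proportion to the square root of the number of arrays, which is the motivation for assaying each pool repeatedly.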
Discovery stage association analysis
The RAF of SNPs for each EOC DNA pool was compared with the RAF of SNPs for the control DNA pool, and allele-based tests were used to evaluate SNP associations according to subtypes using the SingleMarker test, implemented in the program GENEPOOL (Homer et al. 2008; Pearson et al. 2007). The SingleMarker test is a modified two-tailed Student’s t test that divides the difference in RAF of SNPs between cases and controls by the variance components specific to pooling (for example, variance due to pool construction, and variance due to SNP arrays) (Earp et al. 2011; Homer et al. 2008; Pearson et al. 2007).
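To make the idea of dividing a RAF difference by pooling-specific variance concrete, the following Python sketch computes a simplified statistic. It is not the GENEPOOL SingleMarker implementation: it assumes only two variance sources (binomial sampling of chromosomes into the pool and array measurement error, taken as the squared standard error across replicate arrays) and uses a normal approximation rather than a t distribution. All input values are hypothetical.

```python
import math


def pooled_allelic_test(raf_case, se_case, n_case, raf_ctrl, se_ctrl, n_ctrl):
    """Return (z, two-sided p) for a RAF difference between two DNA pools.

    se_*: standard error of the mean RAF across replicate arrays.
    n_*:  number of individuals in each pool (2N chromosomes).
    """
    # Binomial sampling variance of an allele frequency: p(1 - p) / (2N).
    samp_case = raf_case * (1 - raf_case) / (2 * n_case)
    samp_ctrl = raf_ctrl * (1 - raf_ctrl) / (2 * n_ctrl)
    # Measurement variance: squared standard error of the replicate mean.
    variance = samp_case + samp_ctrl + se_case ** 2 + se_ctrl ** 2
    z = (raf_case - raf_ctrl) / math.sqrt(variance)
    # Two-sided tail probability under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical pool of 75 cases compared with the 398-control pool.
z, p = pooled_allelic_test(0.40, 0.005, 75, 0.30, 0.005, 398)
print(f"z = {z:.2f}, p = {p:.4g}")
```

Because the sampling term shrinks with pool size, small case pools contribute most of the variance, which is one reason the less-common subtypes are harder to study at the discovery stage.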
SNP selection for replication stage
[Table 2: Number of pool-based GWAS SNPs selected for replication by EOC subtype. Recovered column headings: SNP selection method; # SNPs chosen; # SNPs successfully genotyped; # independent tests of association. Table body not recoverable.]
Replication stage study populations
Forty-three studies participating in OCAC contributed samples and data to the COGS effort. The OCAC studies have been described previously (Pharoah et al. 2013). All studies had data on disease status, age at diagnosis or interview, and histological subtype. Most studies frequency matched controls to cases on age group and race. Nine studies were case-only and were combined with case–control studies from the same geographical regions. Two Australian studies were also combined, creating 34 case–control sets.
Replication stage genotyping and QC
The COGS genotyping and QC processes have been described (Pharoah et al. 2013). Briefly, OCAC samples were genotyped at two centers, McGill University and Genome Quebec Innovation Centre (Montreal, PQ) and the Mayo Clinic Medical Genome Facility (Rochester, MN), and genotype calling and QC were performed centrally at the University of Cambridge (Cambridge, UK). Of 47,630 OCAC samples genotyped, 44,308 passed QC. Concordance was >99.6 % among duplicates. Samples were excluded as follows: (1) a call rate of <95 %; (2) heterozygosity >5 standard deviations from the ancestry-specific mean; (3) ambiguous sex; (4) lowest call rate from a first-degree relative pair; (5) duplicate samples that were non-concordant for genotype or genotypic duplicates not concordant for phenotype. Of the 198 SNPs chosen by the current investigation, 188 (94.9 %) passed QC. SNPs were excluded if: (1) the call rate was <95 % with MAF >5 % or <99 % with MAF <5 %; (2) they were monomorphic; (3) P values of HWE in controls were <10⁻⁷; (4) there was >2 % discordance in duplicate pairs; or (5) no genotypes were called.
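The five SNP exclusion criteria listed above can be expressed as a small Python filter. This is a sketch rather than the COGS pipeline: the function name and the reason labels are hypothetical, inputs are assumed to be proportions between 0 and 1, and a MAF of exactly 5 % (which the criteria leave unspecified) is grouped here with the common SNPs.

```python
def snp_qc_failures(call_rate, maf, hwe_p_controls, dup_discordance):
    """Return the list of exclusion criteria a SNP trips (empty list = pass).

    call_rate, maf and dup_discordance are proportions in [0, 1].
    """
    failures = []
    if call_rate == 0:
        failures.append("no genotypes called")
    # Rarer SNPs must meet the stricter call-rate threshold.
    # Assumption: MAF of exactly 5 % is treated as common.
    min_call_rate = 0.95 if maf >= 0.05 else 0.99
    if call_rate < min_call_rate:
        failures.append("low call rate")
    if maf == 0:
        failures.append("monomorphic")
    if hwe_p_controls < 1e-7:
        failures.append("HWE in controls")
    if dup_discordance > 0.02:
        failures.append("duplicate discordance")
    return failures


# A rare SNP with a 97 % call rate fails; a common SNP with the same rate passes.
print(snp_qc_failures(0.97, 0.03, 0.5, 0.0))
print(snp_qc_failures(0.97, 0.20, 0.5, 0.0))
```

Returning the reasons rather than a single boolean makes it easy to tabulate why each SNP was dropped.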
As an additional QC check, SNP-EOC associations detected in the pool-based GWAS data were compared with SNP-EOC associations evaluated using genotyped data for 915 of the 943 discovery samples (97 %) with sufficient DNA for genotyping. These samples were genotyped on the custom Illumina Infinium iSelect array as part of the COGS initiative, and evaluated with an allelic χ² test (PLINK v1.07), the test most comparable to the SingleMarker test.
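For readers unfamiliar with the allelic test, the sketch below computes a Pearson χ² statistic with one degree of freedom on a 2 × 2 table of allele counts (cases versus controls, minor versus major allele). It is an illustration under stated assumptions, not PLINK's code: no continuity correction is applied, the allele counts are hypothetical, and the P value uses the identity that the upper tail of a 1-df χ² distribution equals erfc(√(x/2)).

```python
import math


def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, no continuity correction) on allele counts."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical allele counts: 2 alleles per genotyped individual.
chi2, p = allelic_chi2(30, 70, 20, 80)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Each individual contributes two alleles to the table, so the test compares allele frequencies rather than genotype frequencies, which is why it is the natural counterpart to a pool-based allele frequency test.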
Replication stage association analysis
Analyses were further restricted to 36,352 eligible subjects (13,188 cases and 23,164 controls) of European descent. For each EOC subtype, SNP associations were estimated using unconditional logistic regression treating the number of minor alleles as an ordinal variable (log-additive model) and adjusting for population substructure by including the first five principal components (eigenvectors) from principal components analysis (see Pharoah et al. 2013). Minor allele frequency for each SNP was calculated using genotypes for control subjects of European descent in OCAC. Separate analyses were carried out for each study within EOC subtype, and odds ratios (ORs) and 95 % confidence intervals (CIs) were then combined across studies using fixed effects meta-analysis. Analyses were performed including and excluding the OVA study (i.e., the source of the discovery stage samples). The I² statistic was estimated to quantify the proportion of total variation due to heterogeneity across studies, and the heterogeneity of ORs between studies was tested with Cochran’s Q statistic. The R statistical package rmeta was used to generate forest plots. Statistical analysis was conducted in PLINK (v1.07) (Purcell et al. 2007). Adjustment for multiple testing was performed using a Bonferroni correction of the Type I error. Because unique SNPs were selected for each EOC subtype, we treated each set of SNPs independently, and treated correlated (cluster) SNPs as one independent test (Table 2).
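The combination of study-specific estimates can be sketched as a standard inverse-variance fixed effects meta-analysis with Cochran's Q and I². The Python below is a generic illustration, not the analysis code used in the study; the per-study log odds ratios and standard errors are hypothetical, and the 95 % CI uses the usual 1.96 normal quantile.

```python
import math


def fixed_effects_meta(log_ors, ses):
    """Inverse-variance fixed effects pooling of per-study log odds ratios."""
    weights = [1 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / w_sum
    se_pooled = math.sqrt(1 / w_sum)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "q": q,
        "df": df,
        "i2": i2,
    }


# Hypothetical per-study estimates for one SNP.
result = fixed_effects_meta([0.0, 0.2], [0.1, 0.1])
print(result)
```

When only an OR and its 95 % CI are available for a study, the standard error on the log scale can be recovered as (ln(upper) − ln(lower)) / (2 × 1.96) before pooling.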
[Table: Associations between SNPs and subtype-specific ovarian cancer risk in discovery and replication stages. Recovered column headings: SNP (max > min); OR (95 % CI). Recovered row labels: "Rep + OVA" (replication samples plus OVA, repeated for nine SNPs), grouped under subtype headings that include endometrioid/clear cell and LMP serous. Table body not recoverable.]
Invasive serous subtype
Twenty-four SNPs representing 15 loci were tested for association with 6,881 invasive serous EOC and 21,530 controls. None of these SNPs was associated with risk (data not shown).
Our objective was to discover risk alleles for the less-common EOC subtypes by performing a pool-based GWAS, followed by replication of associations using genotypes from 13,188 cases and 23,164 controls from OCAC. Nine SNPs tagging six loci were found to be associated with risk at P < 0.05 with ORs that agreed in direction of effect with the discovery stage samples. Only one of these, rs9609538, remained statistically significant with LMP serous EOC following correction for multiple testing in analyses that included the 75 discovery samples.
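The multiple-testing comparison can be checked with a few lines of Python. The two P values for rs9609538 and the threshold of 0.0012 are those reported in the abstract; the number of independent tests (42) is an assumption back-calculated from 0.05 / 0.0012 and is not stated in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: 42 independent tests, inferred from the reported threshold.
ASSUMED_TESTS = 42
threshold = bonferroni_threshold(0.05, ASSUMED_TESTS)

# Reported P values for rs9609538 and LMP serous EOC.
p_replication_only = 0.0043
p_with_discovery = 0.0007

print(f"threshold = {threshold:.4f}")
print("replication only:", p_replication_only < threshold)
print("with discovery samples:", p_with_discovery < threshold)
```

Under this assumed count, only the analysis that includes the discovery samples falls below the corrected threshold, matching the statement above.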
SNP rs9609538 was associated with decreased risk for LMP serous EOC and, in exploratory analyses, was not associated with any other subtype. This SNP lies on chromosome 22 within a 1 Mb region of 11 genes (YWHAH, LOC402057, SLC5A1, LOC150297, RFPL2, SLC5A4, RFPL3, C22orf28, BPIL2, FBXO7, and SYN3). The minor allele of rs9609538 is predicted to alter transcription factor (TF) binding site activity (multiple TFs including AIRE, AP-4, and CDP CR3) and miRNA binding site activity (hsa-miR-516a-5p and hsa-miR-548d-3p) based on FuncPred algorithms (Xu and Taylor 2009). SNP rs9609538 is positioned between C22orf28 (~500 bp upstream) and BPIL2 (5 bp downstream). BPIL2 is reported to be a rarely expressed lipid transfer/lipopolysaccharide binding protein, involved in recognizing the outer membrane of Gram-negative bacteria (Mulero et al. 2002). It was reported to be abnormally highly expressed in the inflamed skin of psoriasis patients, and implicated in the inflammation and/or immune response (Mulero et al. 2002). The relevance of inflammation processes to risk of LMP tumors was also recently suggested by the association of TNFSF10 (or TRAIL) with this EOC subtype (Charbonneau et al., in submission). C22orf28 encodes a tRNA-splicing ligase protein. Although BPIL2 seems a plausible candidate gene, fine mapping of the association in a larger sample followed by functional assays is needed to determine the gene targeted by this association, followed by further work to determine how it exerts its effects.
Although the other eight SNPs were not significantly associated with subtype-specific EOC risk following adjustments for multiple testing, several loci tagged by these SNPs are in or near genes that have a plausible biological rationale for influencing ovarian cancer pathogenesis. These include rs17106154, which lies within a ~150 kb LD region of ZFP36L1 (also known as BRF1, TIS11B, and Berg36). ZFP36L1 is highly expressed in the ovary (Hacker et al. 2010) and was identified as a VEGF mRNA-destabilizing protein (Planel et al. 2010). ZFP36L1 is altered in 7 % of adenoid cystic carcinomas (Ho et al. 2013) but in only 1 % of invasive serous EOCs in The Cancer Genome Atlas, which is consistent with our finding that the SNP is associated with mucinous (a cystic tumor), but not invasive serous EOC. Three non-coding SNPs (rs2190503, rs6593140, rs2329554) tagging one locus upstream/intronic to GRB10 were associated with risk of endometrioid/clear cell EOC. GRB10 functions in the feedback inhibition of the PI3K/AKT and RAS/MAPK pathways (Hsu et al. 2011; Yu et al. 2011), and genes in these pathways are frequently mutated in endometrioid and clear cell tumors, and occasionally in serous tumors (Gilks 2010). GRB10 may be a tumor suppressor gene that acts in parallel with PTEN to ensure proper levels of activation of the PI3K/AKT pathway (Hsu et al. 2011; Yu et al. 2011). No SNPs were found to be associated with invasive serous risk in the replication samples. However, a previously reported GWAS SNP for invasive serous EOC with a moderately large OR (rs10088218, OR = 0.76) ranked highly in our pool-based data (ranked 3389 and in perfect LD with SNPs that ranked 295 and 445), but did not meet our stringent criteria for selection in replication.
There are several limitations to our study. First, the discovery stage sample size was small, reflecting the low incidence of the less-common EOC subtypes. We, therefore, combined endometrioid and clear cell samples for analysis and primarily investigated associations shared between these subtypes. Thus, associations unique to one subtype could not be evaluated. Second, despite rapid progress in recent years, robust histological subtyping remains a challenge for studies of EOC (Gilks et al. 2008; Gilks and Prat 2009; Han et al. 2008; Kobel et al. 2009, 2010). For example, many tumors that have previously been designated high-grade endometrioid are likely to be high-grade serous EOC (Kobel et al. 2009), and metastatic carcinoma from other organ sites is still difficult to correctly identify from primary mucinous EOC (Kelemen and Kobel 2011). Samples used in the pool-based stage of this study were reviewed using contemporary diagnostic criteria (Gilks et al. 2008); however, many of the OCAC replication studies including samples in our previous GWAS (Song et al. 2009) were not centrally-reviewed, and subtype misclassification may have introduced genetic heterogeneity and reduced statistical power. Third, the number of SNPs (maximum 200) chosen for replication was low. This was a factor restricted by cost and assay design across OCAC investigators and the three other consortia participating in the COGS initiative. The low coverage of genotyped SNPs per locus was also insufficient to allow imputation. Thus, additional genotyping of loci of interest will be needed to narrow down the regions of association.
This study also has several strengths. The DNA pooling design approach has successfully been applied in the GWAS context, including cancer GWAS (Brown et al. 2008; Skibola et al. 2009). The lack of identified risk alleles for EOC subtypes other than invasive serous prompted the current study, and we report a promising candidate for further interrogation for LMP serous EOC. Finally, the large number of EOC samples in OCAC—the largest assembled to date—and specifically of the less-common EOC subtypes, together with the coordinated genotyping and QC success rates achieved for over 200,000 samples and SNPs across four consortia, is a major strength of this study.
In conclusion, our pool-based GWAS of EOC risk according to subtype identified nine SNPs tagging six loci with suggestive associations in the mucinous (5 SNPs), endometrioid/clear cell (3 SNPs), and LMP serous (1 SNP) subtypes. Several tagged loci harbor genes that have a plausible biological rationale for conferring EOC risk. Further evaluation in additional samples will be important to verify these results for the less-common EOC subtypes.
Named individuals: This study would not have been possible without the contributions of the following: P. Hall (COGS); A. M. Dunning and A. Lee (Cambridge); J. Benitez, A. Gonzalez-Neira and the staff of the CNIO genotyping unit; D. C. Tessier, F. Bacot, D. Vincent, S. LaBoissière and F. Robidoux and the staff of the Genome Quebec genotyping unit; S. E. Bojesen, S. F. Nielsen, B. G. Nordestgaard, and the staff of the Copenhagen DNA laboratory; and J.M. Cunningham, S. A. Windebank, C. A. Hilker, J. Meyer and the staff of Mayo Clinic Genotyping Core Facility. We thank all the individuals who took part in this study and all the researchers, clinicians and technical and administrative staff who have made possible the many studies contributing to this work. In particular, we thank: D. Bowtell, A. deFazio, D. Gertig, A. Green, P. Parsons, N. Hayward, P. Webb, and D. Whiteman (AUS); G. Peuteman, T. Van Brussel, and D. Smeets (BEL); U. Eilber (GER); L. Gacucova (HMO); P. Schürmann, F. Kramer, W. Zheng, T.W. Park-Simon, K. Beer-Grondke, and D. Schmidt (HJO); J. Vollenweider (MAY); the MD Anderson Center for Translational and Public Health Genomics (MDA); the state cancer registries of AL, AZ, AR, CA, CO, CT, DE, FL, GA, HI, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY (NHS); L. Paddock, M. King, L. Rodriguez–Rodriguez, A. Samoila, and Y. Bensman (NJO); M. Sherman, A. Hutchinson, N. Szeszenia-Dabrowska, B. Peplonska, W. Zatonski, A. Soni, P. Chao, and M. Stagner (POL); C. Luccarini, P. Harrington, the SEARCH team and ECRIC (SEA); the Scottish Gynaecological Clinical Trials group and SCOTROC1 investigators (SRO); I. Jacobs, M. Widschwendter, E. Wozniak, N. Balogun, A. Ryan, and J. Ford (UKO); and Carole Pye (UKR).
Higher level funding: The COGS project is funded through a European Commission’s Seventh Framework Programme grant (agreement number 223175—HEALTH-F2-2009-223175). The Ovarian Cancer Association Consortium is supported by a grant from the Ovarian Cancer Research Fund thanks to donations by the family and friends of Kathryn Sladek Smith (PPD/RPCI.07). The scientific development and funding for this project were in part supported by the US National Cancer Institute GAME-ON Post-GWAS Initiative (U19-CA148112).
Investigator support: L.E.K. is supported by a Canadian Institutes of Health Research Investigator award (MSH-87734). G.C.-T. is supported by the National Health and Medical Research Council. B.Y.K. holds an American Cancer Society Early Detection Professorship (SIOP-06-258-01-COUN). F.M. is supported by a K-award from the National Cancer Institute (K07-CA080668). W.S. is supported by a K-award from the National Cancer Institute (K07-CA143047). D.F.E. is a Principal Research Fellow of Cancer Research UK.
Funding of constituent studies: This project was funded through grants from the Canadian Institutes of Health Research (MOP-86727, MOP-84340); WorkSafeBC 14, and OvCaRe: BC’s Ovarian Cancer Research Team. Funding of the constituent studies was provided by the American Cancer Society (CRTG-00-196-01-CCE); the California Cancer Research Program (00-01389 V-20170, N01-CN25403, 2II0200); Cancer Council Victoria; Cancer Council Queensland; Cancer Council New South Wales; Cancer Council South Australia; Cancer Council Tasmania; Cancer Foundation of Western Australia; the Cancer Institute of New Jersey; Cancer Research UK (C490/A6187, C490/A10119, C490/A10124, C536/A13086, C536/A6689); the Celma Mastry Ovarian Cancer Foundation; the Danish Cancer Society (94-222-52); the ELAN Program of the University of Erlangen-Nuremberg; the Eve Appeal (Oak Foundation); the Fred C. and Katherine B. Andersen Foundation; the German Cancer Research Center; the German Federal Ministry of Education and Research of Germany, Program of Clinical Biomedical Research (01 GB 9401); the Helsinki University Central Hospital Research Fund; Helse Vest; Imperial Experimental Cancer Research Centre (C1312/A15589); the L & S Milken Foundation; the Lon V. Smith Foundation (LVS-39420); the Mayo Foundation; the Mermaid I project; the Minnesota Ovarian Cancer Alliance; the National Health and Medical Research Council of Australia (199600, 209057, 251533, 396414, 400281, and 504715); Nationaal Kankerplan of Belgium; the Norwegian Cancer Society; the Norwegian Research Council; the OHSU Foundation; the Polish Ministry of Science and Higher Education (4 PO5C 028 14, 2 PO5A 068 27); Pomeranian Medical University; Radboud University Medical Center; the Roswell Park Cancer Institute Alliance Foundation; the Royal Marsden Hospital; the Rudolf-Bartling Foundation; the Sigrid Juselius Foundation; the state of Baden-Württemberg through Medical Faculty of the University of Ulm (P.685); the UK National Institute for Health Research Biomedical Research Centres at the University of Cambridge and the University College London Hospitals; the US Army Medical Research and Material Command (DAMD17-98-1-8659, DAMD17-01-1-0729, DAMD17-02-1-0666, DAMD17-02-1-0669, W81XWH-10-1-0280); the Department of Defense Ovarian Cancer Research Program (W81XWH-07-1-0449); the US National Cancer Institute (K07-CA095666, K22-CA138563, N01-CN55424, N01-PC067010, N01-PC035137, P01-CA017054, P01-CA087696, P30-CA15083, P50-CA105009, P50-CA136393, R01-CA014089, R01-CA016056, R01-CA017054, R01-CA049449, R01-CA050385, R01-CA054419, R01-CA058598, R01-CA058860, R01-CA061107, R01-CA061132, R01-CA063678, R01-CA063682, R01-CA064277, R01-CA067262, R01-CA071766, R01-CA074850, R01-CA076016, R01-CA080742, R01-CA080978, R01-CA083918, R01-CA087538, R01-CA092044, R01-095023, R01-CA106414, R01-CA122443, R01-CA112523, R01-CA114343, R01-CA126841, R01-CA136924, R01-CA149429, R03-CA113148, R03-CA115195, R37-CA070867, R37-CA70867, U01-CA069417, U01-CA071966 and Intramural research funds); the US National Institutes of Health/National Center for Research Resources/General Clinical Research Center (MO1-RR000056); and the US Public Health Service (PSA-042205).