Human Genetics, Volume 131, Issue 2, pp 217–234

Meta-analysis of new genome-wide association studies of colorectal cancer risk

Authors

  • Ulrike Peters
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
    • Department of Epidemiology, School of Public Health, University of Washington
  • Carolyn M. Hutter
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
  • Li Hsu
    • Biostatistics and Biomathematics Program, Fred Hutchinson Cancer Research Center
  • Fredrick R. Schumacher
    • Department of Preventive Medicine, Keck School of Medicine, University of Southern California
  • David V. Conti
    • Department of Preventive Medicine, Keck School of Medicine, University of Southern California
  • Christopher S. Carlson
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
  • Christopher K. Edlund
    • Keck School of Medicine, University of Southern California
  • Robert W. Haile
    • Department of Preventive Medicine, Keck School of Medicine, University of Southern California
  • Steven Gallinger
    • Department of Surgery, Toronto General Hospital, University Health Network
  • Brent W. Zanke
    • Clinical Epidemiology Program, Ottawa Hospital Research Institute
  • Mathieu Lemire
    • Ontario Institute for Cancer Research
  • Jagadish Rangrej
    • Ontario Institute for Cancer Research
  • Raakhee Vijayaraghavan
    • Translational Genomics Research Institute
  • Andrew T. Chan
    • Division of Gastroenterology, Massachusetts General Hospital, Harvard Medical School
    • Channing Laboratory, Brigham and Women’s Hospital and Harvard Medical School
  • Aditi Hazra
    • Channing Laboratory, Brigham and Women’s Hospital and Harvard Medical School
    • Program in Molecular and Genetic Epidemiology, Department of Epidemiology, Harvard School of Public Health
  • David J. Hunter
    • Program in Molecular and Genetic Epidemiology, Department of Epidemiology, Harvard School of Public Health
  • Jing Ma
    • Channing Laboratory, Brigham and Women’s Hospital and Harvard Medical School
  • Charles S. Fuchs
    • Channing Laboratory, Brigham and Women’s Hospital and Harvard Medical School
    • Department of Medical Oncology, Dana-Farber Cancer Institute
  • Edward L. Giovannucci
    • Channing Laboratory, Brigham and Women’s Hospital and Harvard Medical School
    • Departments of Epidemiology and Nutrition, Harvard School of Public Health
  • Peter Kraft
    • Program in Molecular and Genetic Epidemiology, Department of Epidemiology, Harvard School of Public Health
  • Yan Liu
    • Dallas Research Center, Stephens & Associates
  • Lin Chen
    • Department of Health Studies, University of Chicago
  • Shuo Jiao
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
  • Karen W. Makar
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
  • Darin Taverna
    • Translational Genomics Research Institute
  • Stephen B. Gruber
    • Department of Internal Medicine, University of Michigan
  • Gad Rennert
    • Department of Community Medicine and Epidemiology, Carmel Medical Center and Technion Faculty of Medicine
  • Victor Moreno
    • Biostatistics and Bioinformatics Unit, Catalan Institute of Oncology-IDIBELL
  • Cornelia M. Ulrich
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
    • Department of Epidemiology, School of Public Health, University of Washington
    • Division of Preventive Oncology, German Cancer Research Center
  • Michael O. Woods
    • Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland
  • Roger C. Green
    • Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland
  • Patrick S. Parfrey
    • Discipline of Medicine, Faculty of Medicine, Memorial University of Newfoundland
  • Ross L. Prentice
    • Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
  • Charles Kooperberg
    • Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
  • Rebecca D. Jackson
    • Division of Endocrinology, Diabetes and Metabolism, Ohio State University
  • Andrea Z. LaCroix
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
  • Bette J. Caan
    • Division of Research, Kaiser Permanente Medical Care Program
  • Richard B. Hayes
    • Division of Epidemiology, Department of Environmental Medicine, New York University School of Medicine
  • Sonja I. Berndt
    • Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health
  • Stephen J. Chanock
    • Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health
  • Robert E. Schoen
    • Department of Epidemiology, University of Pittsburgh Medical Center
  • Jenny Chang-Claude
    • Division of Cancer Epidemiology, German Cancer Research Center
  • Michael Hoffmeister
    • Division of Clinical Epidemiology and Aging Research, German Cancer Research Center
  • Hermann Brenner
    • Division of Clinical Epidemiology and Aging Research, German Cancer Research Center
  • Bernd Frank
    • Division of Clinical Epidemiology and Aging Research, German Cancer Research Center
  • Stéphane Bézieau
    • Service de Génétique Médicale, Pôle de Biologie, Centre Hospitalier Universitaire (CHU) de Nantes
  • Sébastien Küry
    • Service de Génétique Médicale, Pôle de Biologie, Centre Hospitalier Universitaire (CHU) de Nantes
  • Martha L. Slattery
    • Department of Internal Medicine, University of Utah Health Sciences Center
  • John L. Hopper
    • Centre for Molecular, Environmental, Genetic, and Analytical Epidemiology, University of Melbourne
  • Mark A. Jenkins
    • Centre for Molecular, Environmental, Genetic, and Analytical Epidemiology, University of Melbourne
  • Loic Le Marchand
    • Epidemiology Program, Cancer Research Center of Hawai’i, University of Hawai’i at Manoa
  • Noralane M. Lindor
    • Department of Medical Genetics, Mayo Clinic
  • Polly A. Newcomb
    • Cancer Prevention Program, Fred Hutchinson Cancer Research Center
  • Daniela Seminara
    • Division of Cancer Control and Population Sciences, National Cancer Institute
  • Thomas J. Hudson
    • Ontario Institute for Cancer Research
    • Departments of Medical Biophysics and Molecular Genetics, University of Toronto
  • David J. Duggan
    • Translational Genomics Research Institute
  • John D. Potter
    • Department of Epidemiology, School of Public Health, University of Washington
    • Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
    • Department of Preventive Medicine, Keck School of Medicine, University of Southern California
Original Investigation

DOI: 10.1007/s00439-011-1055-0

Cite this article as:
Peters, U., Hutter, C.M., Hsu, L. et al. Hum Genet (2012) 131: 217. doi:10.1007/s00439-011-1055-0

Abstract

Colorectal cancer is the second leading cause of cancer death in developed countries. Genome-wide association studies (GWAS) have successfully identified novel susceptibility loci for colorectal cancer. To follow up on these findings, and try to identify novel colorectal cancer susceptibility loci, we present results for GWAS of colorectal cancer (2,906 cases, 3,416 controls) that have not previously published main associations. Specifically, we calculated odds ratios and 95% confidence intervals using log-additive models for each study. In order to improve our power to detect novel colorectal cancer susceptibility loci, we performed a meta-analysis combining the results across studies. We selected the most statistically significant single nucleotide polymorphisms (SNPs) for replication using ten independent studies (8,161 cases and 9,101 controls). We again used a meta-analysis to summarize results for the replication studies alone, and for a combined analysis of GWAS and replication studies. We measured ten SNPs previously identified in colorectal cancer susceptibility loci and found eight to be associated with colorectal cancer (p value range 0.02 to 1.8 × 10⁻⁸). When we excluded studies that have previously published on these SNPs, five SNPs remained significant at p < 0.05 in the combined analysis. No novel susceptibility loci were significant in the replication study after adjustment for multiple testing, and none reached genome-wide significance from a combined analysis of GWAS and replication. We observed marginally significant evidence for a second independent SNP in the BMP2 region at chromosomal location 20p12 (rs4813802; replication p value 0.03; combined p value 7.3 × 10⁻⁵). In a region on 5p15.33, which includes the coding regions of the TERT-CLPTM1L genes and has been identified in GWAS to be associated with susceptibility to at least seven other cancers, we observed a marginally significant association with rs2853668 (replication p value 0.03; combined p value 1.9 × 10⁻⁴). Our study suggests a complex nature of the contribution of common genetic variants to risk for colorectal cancer.

Introduction

Colorectal cancer is the second leading cause of cancer death in developed countries, with the lifetime risk estimated to be 5–6% (Ries et al. 2007). Linkage studies have identified important rare germline mutations, such as those in the APC gene and DNA mismatch repair genes, leading to severe syndromes, e.g. familial adenomatous polyposis and Lynch syndrome (also called hereditary non-polyposis colorectal cancer) (de la Chapelle 2004). However, these high-penetrance mutations explain only a small fraction of the genetic risk. To date, genome-wide association studies (GWAS) have identified 14 low-penetrance genetic variants that, together, explain approximately 8% of the familial association of this disease (Broderick et al. 2007; Gruber et al. 2007; Houlston et al. 2008, 2010; Tenesa et al. 2008; Tomlinson et al. 2007, 2008; Zanke et al. 2007). Based on a recent method by Chatterjee and Park (Park et al. 2010) that estimates the amount of familial association explained by common genetic variants, we estimate that about 60–70 common variants [95% confidence interval (CI) 31–173] would explain approximately 17% (95% CI 11.6–35.8%) of the familial association in colorectal cancer. Accordingly, we hypothesize that additional common colorectal cancer susceptibility loci exist that have yet to be identified, and that these loci can be identified through a genome-wide analysis of single nucleotide polymorphism (SNP) data.

As has been demonstrated in studies of other common complex diseases, power to detect novel loci is enhanced by performing meta-analysis that combines GWAS results (Zeggini and Ioannidis 2009). Therefore, we conducted a combined analysis of two recently completed scans that have not previously published main associations, followed by a replication study of the most significant findings using ten independent studies (Table 1; Supplemental Note; Supplemental Table 1) to follow up on the currently established colorectal cancer susceptibility loci and to try to identify additional susceptibility loci. The GWAS meta-analysis included a total of 2,906 cases and 3,416 controls recruited as part of the Colon Cancer Family Registry (CCFR), the Diet, Activity and Lifestyle Study (DALS), the Prostate, Lung, Colorectal and Ovarian Screening Trial (PLCO), and the Women’s Health Initiative (WHI). The replication included a total of 8,161 cases and 9,101 controls from the Nurses’ Health Study (NHS), the Health Professionals Follow-up Study (HPFS), the Physicians’ Health Study (PHS), the Assessment of Risk in Colorectal Tumors In Canada (ARCTIC), additional samples from DALS and CCFR, and case–control studies from Germany, France, Israel, and Newfoundland. Most of these studies are part of the Genetics and Epidemiology Colorectal Cancer Consortium (GECCO; details in Supplemental Note).
Table 1

Studies participating in the genome-wide association study (GWAS) and replication meta-analyses

| Study name | Abbreviation | Cases | Controls | Total |
|---|---|---|---|---|
| GWAS | | | | |
| Colon Cancer Family Registry | CCFR | 1,191 | 999 | 2,190 |
| Women’s Health Initiative | WHI | 483 | 530 | 1,013 |
| Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial | PLCO | 534 | 1,168 | 1,702 |
| Diet, Activity and Lifestyle Survey | DALS | 698 | 719 | 1,417 |
| Replication | | | | |
| Assessment of Risk in Colorectal Tumors in Canada | ARCTIC | 769 | 665 | 1,434 |
| Colon Cancer Family Registry Set II | CCFR-II | 780 | 780 | 1,560 |
| Darmkrebs: Chancen der Verhütung durch Screening | DACHS | 1,731 | 1,742 | 3,473 |
| Diet, Activity and Lifestyle Survey Set II | DALS-II | 691 | 720 | 1,411 |
| French Case–Control Study | FRENCH | 954 | 1,060 | 2,014 |
| Health Professionals Follow-up Study | HPFS | 333 | 595 | 928 |
| Molecular Epidemiology of Colorectal Cancer Study | MECC | 1,686 | 1,779 | 3,465 |
| Newfoundland Familial Colon Cancer Registry | NFCCR | 409 | 321 | 730 |
| Nurses’ Health Study | NHS | 432 | 946 | 1,378 |
| Physician’s Health Study | PHS | 376 | 493 | 869 |

Results

From the fixed-effects meta-analysis of GWAS scans, the inflation factor λ was 1.008, indicating little evidence of residual population substructure, cryptic relatedness, or differential genotyping between cases and controls (Supplemental Figure 1). When analyzed separately, λ was similarly low for each scan (range 1.005 to 1.01).
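The inflation factor reported above can be illustrated with a minimal sketch. It assumes the common definition of λ as the median observed 1-df chi-square statistic divided by the expected median under the null (about 0.455); the function name and the simulated p values are hypothetical and are not taken from the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(p_values):
    """Genomic inflation factor: median observed chi-square / expected median."""
    norm = NormalDist()
    # A two-sided p value p corresponds to |z| = Phi^-1(1 - p/2); chi-square = z^2.
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical input: evenly spaced p values mimic a well-calibrated null scan.
null_p = [(i + 0.5) / 10_000 for i in range(10_000)]
lam = inflation_factor(null_p)
print(f"lambda = {lam:.3f}")
```

A value close to 1, as for the uniform p values here, is what one would expect in the absence of systematic inflation; values well above 1 would point to substructure or technical artifacts.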

Initially, we attempted to validate the ten established susceptibility SNPs (p value < 5 × 10⁻⁸) that had been published at the time we selected SNPs for replication (Broderick et al. 2007; Gruber et al. 2007; Houlston et al. 2008; Tenesa et al. 2008; Tomlinson et al. 2007, 2008; Zanke et al. 2007). We found nominal evidence for association in the same direction with p < 0.05 from combined analyses of GWAS and replication for eight of these ten loci (rs4939827/SMAD7, rs4779584/GREM1, rs16892766/EIF3H, rs3802842/11q23, rs961253/BMP2, rs4444235/BMP4, rs9929218/CDH1, rs6983267/MYC; Table 2). When we excluded results from studies that had been previously published (Supplemental Table 2; Supplemental Figure 2), we found evidence for replication at p < 0.05 for five out of nine SNPs. This latter analysis did not include rs6983267, since a majority of the studies have already published on that SNP.
Table 2

Risk estimates for colorectal cancer for previously reported genome-wide association studies (GWAS) hits

| SNP | Chr | Position (bp) | Gene/locusᵃ | Ref. allele | MAFᵇ | # Casesᶜ | # Controlsᶜ | GWAS OR (95% CI) | GWAS pᵈ | Replication OR (95% CI) | Replication pᵈ | GWAS + replication OR (95% CI) | GWAS + replication pᵈ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs4939827 | 18 | 44707461 | SMAD7 | A | 0.49 | 7,675 | 8,970 | 0.86 (0.79–0.93) | 8.7 × 10⁻⁵ | 0.90 (0.85–0.95) | 2.1 × 10⁻⁴ | 0.88 (0.85–0.93) | 1.1 × 10⁻⁷ |
| rs4779584 | 15 | 30782048 | SCG5, GREM1 | G | 0.19 | 7,669 | 8,795 | 1.15 (1.04–1.26) | 5.2 × 10⁻³ | 1.19 (1.11–1.28) | 8.1 × 10⁻⁷ | 1.18 (1.11–1.24) | 1.8 × 10⁻⁸ |
| rs16892766 | 8 | 117699864 | EIF3H | A | 0.08 | 7,686 | 8,977 | 1.41 (1.23–1.62) | 6.8 × 10⁻⁷ | 1.15 (1.03–1.27) | 9.3 × 10⁻³ | 1.24 (1.14–1.34) | 4.0 × 10⁻⁷ |
| rs3802842 | 11 | 110676919 | LOC120376 | A | 0.28 | 7,677 | 8,782 | 1.13 (1.04–1.23) | 4.1 × 10⁻³ | 1.14 (1.07–1.22) | 2.7 × 10⁻⁵ | 1.14 (1.08–1.20) | 3.8 × 10⁻⁷ |
| rs961253 | 20 | 6352281 | LOC643503, BMP2 | C | 0.35 | 7,683 | 8,790 | 1.10 (1.01–1.19) | 0.020 | 1.10 (1.04–1.17) | 9.0 × 10⁻⁴ | 1.10 (1.05–1.16) | 5.2 × 10⁻⁵ |
| rs4444235 | 14 | 53480669 | BMP4 | A | 0.47 | 7,678 | 8,792 | 1.07 (0.99–1.15) | 0.091 | 1.06 (1.00–1.12) | 0.055 | 1.06 (1.01–1.11) | 0.011 |
| rs9929218 | 16 | 67378447 | CDH1 | G | 0.30 | 7,668 | 8,770 | 0.95 (0.87–1.03) | 0.24 | 0.94 (0.88–1.00) | 0.042 | 0.94 (0.90–0.99) | 0.020 |
| rs10795668 | 10 | 8741225 | LOC338591 | G | 0.32 | 7,681 | 8,790 | 0.94 (0.87–1.02) | 0.14 | 0.98 (0.92–1.04) | 0.50 | 0.97 (0.92–1.01) | 0.15 |
| rs10411210 | 19 | 38224140 | RHPN2 | G | 0.10 | 7,685 | 8,795 | 0.88 (0.77–1.01) | 6.4 × 10⁻² | 0.98 (0.89–1.07) | 0.62 | 0.95 (0.88–1.02) | 0.14 |
| rs6983267 | 8 | 128482487 | POU5F1P1, MYC | C | 0.49 | 4,166 | 4,990 | 0.88 (0.81–0.95) | 7.5 × 10⁻⁴ | 0.80 (0.69–0.93) | 3.3 × 10⁻³ | 0.86 (0.80–0.92) | 1.4 × 10⁻⁵ |

GWAS hits defined as variants with p < 5 × 10⁻⁸

ᵃ Nearest gene or locus for each SNP. For SNPs where two genes/loci are listed, the second gene represents a nearby gene thought to be a strong functional candidate

ᵇ MAF, minor allele frequency based on controls of GWAS studies

ᶜ Numbers of cases and controls differ for each SNP depending on genotyping platform used in replication and exclusions applied based on quality control measurements

ᵈ p value from fixed-effects meta-analysis
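As an illustration of how fixed-effects estimates such as those in Table 2 are pooled, the following sketch applies inverse-variance weighting to the two stage-level estimates for rs4939827/SMAD7. It is an approximation under stated assumptions: the standard errors are back-calculated from the rounded 95% CIs, and only the GWAS and replication summaries are combined, whereas the published analysis pooled study-level results. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error from a 95% confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(odds_ratio), se


def fixed_effects(estimates):
    """Inverse-variance weighted pooling of (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p value
    return math.exp(pooled), pooled_se, p


# Stage-level estimates for rs4939827/SMAD7 as listed in Table 2.
stages = [
    log_or_and_se(0.86, 0.79, 0.93),  # GWAS
    log_or_and_se(0.90, 0.85, 0.95),  # replication
]
pooled_or, pooled_se, p_value = fixed_effects(stages)
print(f"pooled OR = {pooled_or:.2f}, p = {p_value:.1e}")
```

Because of rounding in the published intervals, the pooled values from this sketch should be read as close to, not identical with, the combined column of Table 2.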

The most significant novel SNP in both the replication study, and in the combined analysis of the GWAS and replication, was rs7315438 located on chromosome 12q24 near MED13L [replication odds ratio (OR) = 0.92; replication p value 1.0 × 10⁻³; combined OR = 0.92; combined fixed-effects p value (pfixed) 5.6 × 10⁻⁶; combined random-effects p value (prandom) 1.5 × 10⁻³; Table 3; Supplemental Table 3]. For two other SNPs, their association with colorectal cancer was nominally significant within the replication study: rs4925386 located on chromosome 20q13 near LAMA5 (replication p value 2.5 × 10⁻³; pfixed 2.1 × 10⁻⁴; prandom 0.015) and rs16888522 located on 8q23.3 near EIF3H (replication p value 4.1 × 10⁻³; pfixed 1.7 × 10⁻⁵; prandom 4.5 × 10⁻⁴). We note that the rs4925386 SNP did not have strong evidence for association in the random-effects model. Since we selected rs4925386 for our replication study, it has been identified as being associated with colorectal cancer by a published GWAS (Houlston et al. 2010). The variant rs16888522 is in the region of rs16892766/EIF3H, previously identified to be associated with colorectal cancer by a GWAS (Tomlinson et al. 2008). The variant rs16888522/EIF3H was in weak linkage disequilibrium (LD) with rs16892766/EIF3H (D′ = 0.255; r² = 0.043; Supplemental Figure 4). Conditional analysis, including both variants in the same model, resulted in less significant results for both variants (Supplemental Table 4) and showed weak correlation between the beta coefficients (r = −0.269), which suggests that these variants may not be independently associated SNPs.
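The two LD measures quoted above, D′ and r², can be sketched from haplotype and allele frequencies using their standard definitions. The frequencies below are hypothetical and chosen only for illustration; they are not the study's haplotype data, and the function name is an assumption.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies, for illustration only.
d, d_prime, r2 = ld_measures(p_ab=0.20, p_a=0.30, p_b=0.40)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this example D′ is noticeably larger than r², which shows why the two measures are usually reported together when judging whether two SNPs tag the same signal.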
Table 3

Risk estimates for colorectal cancer for loci with p < 5 × 10⁻⁴ in combined genome-wide association studies (GWAS) and replication analysis

| SNP | Chr | Position (bp) | Gene/locusᵃ | Ref. allele | MAFᵇ | # Casesᶜ | # Controlsᶜ | GWAS (2,906 cases, 3,416 controls) OR (95% CI) | GWAS pᵈ | Replication (8,161 cases, 9,101 controls) OR (95% CI) | Replication pᵈ | GWAS + replication OR (95% CI) | GWAS + replication pᵈ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNPs in regions of prior GWAS hits for colorectal cancer | | | | | | | | | | | | | |
| rs16888522 | 8 | 117643094 | EIF3H | G | 0.06 | 9,308 | 10,708 | 1.32 (1.13–1.54) | 5.0 × 10⁻⁴ | 1.16 (1.05–1.28) | 4.1 × 10⁻³ | 1.20 (1.11–1.31) | 1.7 × 10⁻⁵ |
| rs4813802 | 20 | 6647595 | BMP2 | A | 0.35 | 11,498 | 13,075 | 1.18 (1.09–1.28) | 5.5 × 10⁻⁵ | 1.05 (1.01–1.10) | 0.025 | 1.08 (1.04–1.13) | 7.3 × 10⁻⁵ |
| SNPs in new regions | | | | | | | | | | | | | |
| rs7315438 | 12 | 114375786 | MED13L | A | 0.42 | 9,255 | 10,657 | 0.88 (0.81–0.95) | 1.2 × 10⁻³ | 0.92 (0.87–0.97) | 0.001 | 0.90 (0.87–0.94) | 5.6 × 10⁻⁶ |
| rs275454 | 5 | 6869013 | LOC729434/POLS | G | 0.37 | 11,348 | 12,925 | 1.16 (1.07–1.25) | 2.9 × 10⁻⁴ | 1.06 (1.01–1.11) | 0.014 | 1.08 (1.04–1.13) | 8.7 × 10⁻⁵ |
| rs2373859 | 2 | 40471324 | SLC8A1 | A | 0.35 | 10,999 | 12,483 | 0.87 (0.80–0.95) | 8.8 × 10⁻⁴ | 0.95 (0.91–0.99) | 0.010 | 0.93 (0.90–0.97) | 1.4 × 10⁻⁴ |
| rs2853668 | 5 | 1353025 | TERT-CLPTM1L | C | 0.25 | 9,019 | 10,413 | 0.85 (0.78–0.93) | 4.5 × 10⁻⁴ | 0.94 (0.88–0.99) | 0.032 | 0.91 (0.87–0.96) | 1.9 × 10⁻⁴ |
| rs4925386 | 20 | 60354439 | LAMA5 | G | 0.30 | 10,485 | 11,629 | 0.91 (0.84–0.99) | 0.032 | 0.93 (0.88–0.97) | 2.5 × 10⁻³ | 0.92 (0.88–0.96) | 2.1 × 10⁻⁴ |
| rs1525461 | 7 | 144359224 | LOC643308/TPK1 | A | 0.19 | 10,978 | 12,483 | 1.20 (1.08–1.32) | 3.6 × 10⁻⁴ | 1.06 (1.00–1.12) | 0.037 | 1.09 (1.04–1.14) | 3.6 × 10⁻⁴ |

ᵃ Nearest gene or locus for each SNP. For SNPs where two genes/loci are listed, the second gene represents a nearby gene thought to be a strong functional candidate

ᵇ MAF, minor allele frequency based on controls of GWAS studies

ᶜ Numbers of cases and controls differ for each single-nucleotide polymorphism (SNP) depending on genotyping platform used in replication and exclusions applied based on quality control measurements

ᵈ p value from fixed-effects meta-analysis

We identified five other loci with p < 0.05 in our replication study and combined p value <10⁻⁴ (Table 3). The associated SNPs were located near BMP2, POLS, SLC8A1, TERT-CLPTM1L and TPK1. One was in a region previously identified to be a colorectal cancer susceptibility locus by a GWAS (rs4813802/BMP2) (replication p value 0.03; pfixed 7.3 × 10⁻⁵; prandom 0.014; Table 3; Supplemental Figure 3; Supplemental Table 3). The higher p value for the random effects for this SNP reflects the fact that we observed evidence of heterogeneity among GWAS for rs4813802 (I² = 76.1% and p = 0.06) (Ioannidis et al. 2007); however, this was less pronounced among the replication studies (I² = 40.4% and p = 0.08) suggesting that the result is consistent among studies after accounting for those that may be subject to the “winner’s curse” (Garner 2007). rs4813802/BMP2 was not in LD with the known colorectal cancer susceptibility SNP in the region rs961253/BMP2 (D′ = 0.02, r² < 0.001) (Houlston et al. 2008), and the joint conditional analysis demonstrates the independent association of both variants with colorectal cancer risk (correlation of beta coefficients = 0.018; Supplemental Table 4; Fig. 1).
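The heterogeneity statistic I² quoted above is derived from Cochran's Q. A minimal sketch of that calculation follows, using the usual definition I² = max(0, (Q − df)/Q); the per-study log odds ratios and standard errors are hypothetical, since the study-level estimates are not given in this section.

```python
def cochran_q_and_i2(betas, ses):
    """Cochran's Q and I^2 (in percent) for per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q sums the weighted squared deviations from the fixed-effects estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2 is the share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical per-study estimates, for illustration only.
betas = [0.30, 0.05, 0.20, 0.02]
ses = [0.08, 0.07, 0.09, 0.06]
q, i2 = cochran_q_and_i2(betas, ses)
print(f"Q = {q:.2f}, I^2 = {i2:.1f}%")
```

When the study estimates agree closely, Q falls below its degrees of freedom and I² is truncated at zero, which corresponds to the situation in which fixed- and random-effects p values are similar.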
Fig. 1

Regional association results and LD structure for the associated region on chromosome 20p12.3/BMP2 locus. The top half of the figure has physical position along the x-axis, and the −log10 of the meta-analysis p value on the y-axis. Each dot on the plot represents the result for one SNP. The bottom half of the figure shows pairwise linkage disequilibrium (LD) for the genotyped SNPs across the region. LD was measured as r² and calculated using the control individuals from the WHI, PLCO and DALS samples recruited from 53 centers across the USA. Darker shading indicates higher levels of LD. The lines between the top and bottom halves of the figure connect the same SNPs

For all SNPs in Table 3, we tested if the risk estimates of these variants may vary by mode of inheritance or sex. While for some variants (rs4813802/BMP2, rs275454/POLS, rs2373859/SLC8A1, and rs2853668/CLPTM1L), the recessive model tended to provide stronger risk estimates and slightly lower p values than the log-additive or dominant model, the AIC value was >2 in all cases, indicating no statistical evidence for improvement over the log-additive model (Supplemental Table 5). We also explored if results vary by sex and found that for rs16888522/EIF3H the statistical evidence for association was stronger in men (OR = 1.25; p value = 0.002) than in women (OR = 1.10; p value = 0.25), although the effect estimates were in the same direction and similar in magnitude for both men and women (Supplemental Table 6).
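The modes of inheritance compared above differ only in how the genotype enters the regression model. The sketch below shows the conventional coding of the risk-allele count under each model and the implied genotype-specific odds ratio under the log-additive model; the function names are hypothetical, and the per-allele OR of 1.08 is the combined estimate for rs4813802 from Table 3.

```python
def encode_genotype(risk_allele_count, model):
    """Code a genotype (0, 1 or 2 risk alleles) under a given inheritance model."""
    if risk_allele_count not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1 or 2")
    if model == "log-additive":
        return risk_allele_count                    # each allele adds equally on the log-odds scale
    if model == "dominant":
        return 1 if risk_allele_count >= 1 else 0   # one copy is sufficient
    if model == "recessive":
        return 1 if risk_allele_count == 2 else 0   # two copies are required
    raise ValueError(f"unknown model: {model}")


def genotype_odds_ratio(per_allele_or, risk_allele_count):
    """Odds ratio versus non-carriers implied by a log-additive per-allele OR."""
    return per_allele_or ** risk_allele_count


for model in ("log-additive", "dominant", "recessive"):
    print(model, [encode_genotype(g, model) for g in (0, 1, 2)])
print(round(genotype_odds_ratio(1.08, 2), 4))
```

Under the log-additive coding, a per-allele OR of 1.08 implies an OR of about 1.17 for carriers of two risk alleles relative to non-carriers.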

As a sensitivity analysis, we reran the combined fixed-effects meta-analysis leaving out one study at a time for all SNPs in Table 3. In no case did the point estimate change by more than 3%. Further, all pfixed values remained <5 × 10⁻³ except when we removed the French study from the analysis of rs4925386. In that case, the OR remained similar (OR = 0.94), but the p value was slightly attenuated (pfixed = 8.2 × 10⁻³).
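The leave-one-out procedure described above can be sketched as follows. The study labels, log odds ratios and standard errors are hypothetical placeholders, since per-study estimates are not listed in this section; only the logic of dropping one study at a time and re-pooling is illustrated.

```python
import math


def pooled_log_or(betas, ses):
    """Fixed-effects (inverse-variance) pooled log odds ratio."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * b for w, b in zip(weights, betas)) / sum(weights)


def leave_one_out(names, betas, ses):
    """Re-pool after dropping each study; report the OR and its percent change."""
    full_or = math.exp(pooled_log_or(betas, ses))
    results = {}
    for i, name in enumerate(names):
        rest_betas = betas[:i] + betas[i + 1:]
        rest_ses = ses[:i] + ses[i + 1:]
        or_without = math.exp(pooled_log_or(rest_betas, rest_ses))
        results[name] = (or_without, 100.0 * (or_without - full_or) / full_or)
    return full_or, results


# Hypothetical study-level estimates, for illustration only.
names = ["study_1", "study_2", "study_3", "study_4"]
betas = [-0.10, -0.08, -0.12, -0.09]
ses = [0.05, 0.04, 0.06, 0.05]
full_or, results = leave_one_out(names, betas, ses)
for name, (or_without, pct) in results.items():
    print(f"without {name}: OR = {or_without:.3f} ({pct:+.2f}%)")
```

A result is considered robust in this sense when no single omission shifts the pooled estimate by more than a small tolerance, such as the 3% quoted in the text.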

Discussion

From the analysis of GWAS and replication, including a total of up to 11,067 cases and 12,517 controls, we found that SNPs in eight out of ten previously identified colorectal cancer susceptibility loci were associated with the disease in our replication study at p < 0.05. We found evidence that a second SNP (rs4813802) near the BMP2 gene could be associated with colorectal cancer, independent of the association with the previously identified susceptibility SNP in that region (rs961253). Furthermore, our study reports for the first time a potential new association of a variant in the TERT-CLPTM1L region with colorectal cancer risk.

Our results provide further support for eight of ten previously identified GWAS hits. When excluding studies that have previously published results on these known loci, five loci showed evidence of replication in this independent subsample. The 8q24 SNP rs6983267 has already been heavily studied, including published reports for many of the studies included in this paper (Figueiredo et al. 2011; Hutter et al. 2010), so we were not able to examine independent replication of this SNP in this study. Among the remaining four loci that did not show a significant association at p < 0.05, three showed a trend toward replication (with p < 0.2 and an OR in the same direction as the original GWAS report). However, one SNP, rs10795668, did not show any evidence for association with disease (OR = 1.00; 95% CI 0.93–1.08; p = 0.96; Supplemental Figure 2). Several papers have reviewed potential reasons for the lack of replication of GWAS findings (Chanock et al. 2007; Kraft et al. 2009). As in any observational study, it is possible this represents either a false positive in the initial report or a false negative in this replication, although both seem unlikely because both the discovery GWAS and this report are based on large, well-powered studies. We used the same genetic model and similar trait definitions as the discovery GWAS. Further, all studies were restricted to non-Hispanic Whites, limiting the possibility of differences in LD patterns. It is possible that there may be differences in the distribution of a key effect modifier between the studies used to identify rs10795668 and the studies presented in this paper. A full exploration of underlying gene–gene or gene–environment interactions is beyond the scope of the current paper, but we did explore if the effect of rs10795668 varied by sex. Although the results were not significant for either sex, and the 95% CIs overlap, we do note an interesting pattern where the ORs are in opposite directions for women and men, with men showing a trend in the direction of the discovery GWAS. Specifically, we found OR = 1.07 in women (95% CI 0.92–1.26; pfixed = 0.38) and OR = 0.95 in men (95% CI 0.84–1.07; pfixed = 0.40).

The rs4925386/LAMA5 SNP was also recently identified in another GWAS meta-analysis (Houlston et al. 2010). Although it was not a known colorectal cancer susceptibility locus at the time we selected SNPs for replication, this SNP met our criteria for selection, and showed evidence for association in our replication sample. The rs4925386 variant lies in the intron of the large laminin A5 protein encoding gene. As previously reported the variant is in LD (r2 > 0.5) with four nonsynonymous SNPs in LAMA5 (Houlston et al. 2010). However, the prediction of each of these amino acid changes is proposed to be benign. Overall, our finding provides additional independent support that this variant is associated with susceptibility to colorectal cancer.

None of the loci were significantly associated with colorectal cancer in our replication study after adjusting for multiple testing (0.05/321 = 1.6 × 10⁻⁴), and none of the loci reached “genome-wide significance” at the suggested p value of 1.6 × 10⁻⁷ after accounting for the two-stage design (for details, see “Materials and methods”) (Dudbridge and Gusnanto 2008; Hoggart et al. 2008; International HapMap Consortium 2005; Pe’er et al. 2008; Risch and Merikangas 1996; Wellcome Trust Case Control Consortium 2007). However, for some of the variants with p < 0.05 in our replication and combined p value <10⁻⁴, additional lines of evidence provide support for the hypothesis that we may have identified genomic regions harboring causal variants for colorectal cancer susceptibility. The variant rs4813802 is about 295.3 kb centromeric to the previously identified rs961253/BMP2 GWAS hit (Houlston et al. 2008); both statistical models and LD data support the idea that these are independent signals. The closest gene is bone morphogenetic protein 2 (BMP2). The new variant of interest, rs4813802, is closer to BMP2 (49.2 kb upstream) than the previously identified SNP rs961253 (344.5 kb upstream of BMP2). Interestingly, rs4813802 lies within an ENCODE Digital DNaseI Hypersensitivity Cluster; it is also within an ENCODE region showing H3K4Me1 enhancer-associated histone marks (Rosenbloom et al. 2010), and the flanking 15 bp shows strong placental mammal conservation by PhastCons (Siepel et al. 2005). While not conclusive, all of these are consistent with the region flanking the SNP acting as a long-range enhancer element, plausibly for BMP2. The BMP2 gene belongs to the transforming growth factor-β (TGFβ) superfamily, which plays an important role in cell proliferation, differentiation, and apoptosis (Massague 2000). Five of the ten known colorectal cancer SNPs are located in or near TGFβ superfamily genes (Tenesa and Dunlop 2009). Furthermore, loss in BMP signaling has been reported at the transition from advanced adenoma to early cancer stage, compatible with a role in tumor progression (Hardwick et al. 2008). Support for a role for BMP signaling in colorectal cancer comes from the identification of mutations in the bone morphogenetic protein receptor, type IA protein (BMPR1A) in juvenile polyposis (Howe et al. 2001). Individuals with familial juvenile polyposis have a 20% risk of colon cancer by age 35 and 68% by age 60 (Schreibman et al. 2005). Our finding supports the possibility of allelic heterogeneity at the BMP2 locus, which is consistent with findings for the 8q24 cancer locus (Al Olama et al. 2009; Witte 2007) and recent findings for height showing evidence for allelic heterogeneity at as many as 19 loci (Lango et al. 2010). Similar to our finding, these 19 secondary signals in height were rather distant (on average 177 kb) from the initial index SNP that was found to be associated through GWAS (Lango et al. 2010). Accordingly, a comprehensive exploration of already discovered colorectal cancer loci may uncover additional independent variants. However, this example demonstrates that defining the boundaries of a susceptibility locus may be challenging, because the SNP we identified (rs4813802) would not have been included if we had defined the region around the initial index SNP (rs961253) by LD.
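The multiple-testing threshold quoted in this paragraph is a Bonferroni correction, and the arithmetic can be checked with a short sketch. The replication p values are those listed in Table 3; the variable names are illustrative.

```python
ALPHA = 0.05
N_TESTS = 321  # number of tests in the replication stage, as stated in the text

# Bonferroni-corrected significance threshold.
threshold = ALPHA / N_TESTS

# Replication p values as listed in Table 3.
replication_p = {
    "rs16888522": 4.1e-3,
    "rs4813802": 0.025,
    "rs7315438": 0.001,
    "rs275454": 0.014,
    "rs2373859": 0.010,
    "rs2853668": 0.032,
    "rs4925386": 2.5e-3,
    "rs1525461": 0.037,
}

significant = [snp for snp, p in replication_p.items() if p < threshold]
print(f"threshold = {threshold:.1e}")
print(f"SNPs passing the threshold: {significant}")
```

None of the listed replication p values falls below the corrected threshold, which matches the statement that no locus was significant in the replication study after adjustment for multiple testing.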

The 8q24 region has been shown to have multiple independent variants that are associated with cancers. Several of these variants are associated with more than one cancer, and some cancers are associated with multiple variants in this region (Al Olama et al. 2009; Witte 2007). Similarly, multiple variants associated with various cancer sites, including cancers of lung, pancreas, testes, and bladder, as well as glioma, basal cell carcinoma, and melanoma are found in the TERT-CLPTM1L region (Fig. 2) (Hsiung et al. 2010; Landi et al. 2009; McKay et al. 2008; Miki et al. 2010; Petersen et al. 2010; Rafnar et al. 2009; Shete et al. 2009; Stacey et al. 2009; Turnbull et al. 2010; Wang et al. 2008). Ours is the first report suggesting that a variant in the TERT-CLPTM1L region could be associated with colorectal cancer. The variant rs2853668 is 4.9 kb upstream of telomerase reverse transcriptase (TERT) and 18.0 kb downstream of cleft lip and palate transmembrane protein 1-like protein (CLPTM1L). Both genes have been implicated in cancer: CLPTM1L has been shown to be altered in cisplatin-resistant cell lines and potentially impacts apoptosis (Yamamoto et al. 2001); TERT encodes the telomerase catalytic subunit that is important for the replication and stabilization of telomere ends, and subsequently impacts chromosome replication and suppression of cell senescence. Malfunction of telomerase can result in chromosomal abnormality and subsequent tumor formation (Rafnar et al. 2009). Our finding provides further evidence that TERT-CLPTM1L is a general cancer susceptibility locus that impacts functions critical for cancer development, similar to the 8q24 region. The candidate gene in the 8q24 locus is MYC, and as noted by Johnatty et al. (2010), these two loci could act in concert. Specifically, the TERT promoter has several MYC binding sites (Wu et al. 1999); however, we did not observe a statistically significant interaction between rs2853668/TERT-CLPTM1L and rs6983267/8q24, MYC (p for interaction term = 0.8).
Fig. 2

Genetic variants associated with different cancer sites in the TERT-CLPTM1L region, including the new finding for colorectal cancer for rs2853668. This figure shows the genomic region on chromosome 5p15.33 including the two genes TERT and CLPTM1L. The top of the figure shows the genes with each exon represented by a short vertical line. The line below the genes shows the location of the different SNPs that have been associated with various cancer sites relative to the position of the genes (for instance, the first two SNPs are located in the last intron of TERT). The triangle shows the pairwise linkage disequilibrium (LD) of the SNPs. LD was measured as r² and calculated using the control individuals from the WHI, PLCO and DALS samples recruited from 53 centers across the USA. Darker shading indicates higher levels of LD.

The SNP rs7315438, which showed the most statistically significant association both in the replication study alone and in the combined meta-analysis of GWAS and replication studies, is located on chromosome 12q24 about 76.9 kb upstream of the T-box 3 protein (TBX3). The SNP is also located 50.4 kb downstream of MED13L, which encodes a subunit of the mediator complex, a large complex of proteins that functions as a transcriptional coactivator for most RNA polymerase II-transcribed genes. Since it has been implicated in transcription, this gene is a plausible candidate for further study. However, this SNP is in a large LD region containing numerous other potential candidate genes, including the kinase suppressor of RAS2 (KSR2).

Other SNPs identified as potentially associated with colorectal cancer in this study are rs27545 (POLS), rs2373859 (SLC8A1) and rs1525461 (LOC643308/TPK1). The gene closest to rs27545 is POLS (59 kb downstream), a DNA polymerase that is likely involved in DNA repair and, hence, provides a potentially interesting candidate gene (Hubscher et al. 2002). Other genes close to rs27545 are SRD5A1 (146 kb upstream), which converts testosterone into the more potent dihydrotestosterone, and the methyltransferase NSUN2 (183 kb downstream), which methylates tRNA (Brzezicha et al. 2006). The SNP rs2373859 resides in the intronic region of SLC8A1, also known as NCX1, which is a cell membrane protein that is involved in rapid Ca2+ transport (Annunziato et al. 2004). It is in a gene-rich region including other interesting candidates, such as MAP4K3 (954 kb upstream), a member of the mitogen-activated protein kinases, which is involved in regulating both cell growth and death and has altered gene expression in many cancer types (Cuadrado and Nebreda 2010), and SOS1, which may act as a positive regulator of RAS (Freedman et al. 2006). The closest gene to rs1525461 is TPK1 (195 kb upstream). TPK1 is involved in the regulation of thiamine metabolism (Timm et al. 2001). TPK1 flanks a gene-rich region, including several olfactory receptors, but none of the genes has an obvious link to colorectal cancer development. However, the assignment of SNPs to candidate genes should be done with caution, as recently shown by additional fine mapping and in silico analysis of the previously identified colorectal cancer loci 8q23.3 (EIF3H) and 16q22.1 (CDH1/CDH3), which suggested functional variation in unexpected candidate target genes (Carvajal-Carmona et al. 2011).

Overall, our study suggests a complex nature of the contribution of common genetic variants to risk for colorectal cancer, and suggests the need for additional studies to identify variants with marginal effects, as well as studies to examine potential sources and role of heterogeneity, including gene–gene and gene–environment interactions. We note that this study focused on the log-additive model. Although we present results for other genetic models for our top findings, our results may have been biased for SNPs that do not follow this assumed log-additive model (Minelli et al. 2005). Further, this study was not set up to investigate less frequent (allele frequency 1–5%) and rare variants (allele frequency < 1%), which have the potential to contribute substantially to the genetic susceptibility of colorectal cancer (Bodmer and Bonilla 2008; Cirulli and Goldstein 2010; Manolio et al. 2009).

In summary, we replicated the majority of SNPs that have previously been found to be associated with CRC in GWAS studies. We also report suggestive evidence for an additional independent signal for colorectal cancer risk in the BMP2 locus and a possible new association of colorectal cancer with a variant in the multi-cancer susceptibility locus around TERT-CLPTM1L. Future studies are needed to try to replicate these findings, and if successful, to identify the underlying variants directly responsible for the association, and to study the underlying molecular mechanisms.

Materials and methods

Study participants

The studies and their abbreviations are listed in Table 1, and each study is described in detail in the Supplemental Note. In brief, all cases were defined as colorectal adenocarcinoma (International Classification of Disease Code 153-154) and confirmed by medical records, pathologic reports, or death certificate. All cases and controls were self-reported as White, which was confirmed in GWAS samples based on genotype data. All participants gave written informed consent and studies were approved by the Institutional Review Board.

Study design

The GWAS meta-analysis results are based on two scans. One GWAS was conducted within the CCFR, including population-based cases and unrelated population-based controls from three sites: USA, Canada, and Australia (Figueiredo et al. 2011). In total, 1,191 cases and 999 controls were successfully genotyped on the Illumina 1M/1M Duo platform and passed all quality-control (QC) steps. The second scan was conducted across three US studies: the WHI and PLCO cohorts and the DALS population-based case–control study. A total of 1,715 colon cancer cases and 2,417 controls were successfully genotyped on the Illumina HumanHap 550K, 610K or combined Illumina 300K and 240K platforms and passed all QC steps. After applying rigorous genotyping QC filters (see below), a total of 378,739 directly genotyped SNPs commonly shared among the scans were included in the GWAS meta-analysis. To further boost the power and inform the ranking of SNPs, we included summary statistics from a previously published colorectal cancer GWAS (Colorectal Tumour Gene Identification Consortium, CORGI) in the meta-analysis (The Institute of Cancer Research 2008; Tomlinson et al. 2008). However, to ensure independence of results from prior published scans, we did not include any CORGI results in any of the presented ORs or p values.

Fixed-effects p values from the GWAS meta-analysis were used to select SNPs for replication. We rank ordered the top SNPs. We used LD information in our controls to prune out “redundant” signals (defined as r² > 0.5 for SNPs ≥ 100 kb apart and r² > 0.1 for SNPs < 100 kb apart). For the top five SNPs, with p < 10⁻⁵, we selected two other SNPs with r² > 0.9 to ensure against potential genotyping failure. We then went down the ranked list until we filled our SNP platform (total number of SNPs selected for this project = 343). SNPs were excluded based on p value for heterogeneity < 0.001 (n = 1) and poor clustering in visual inspection of cluster plots (n = 3). If SNPs had a low design score, we replaced them with an alternative SNP with r² > 0.9. The lowest ranked SNP had p value 1.2 × 10⁻³. Our platform also included SNPs for the ten known colorectal cancer susceptibility loci published in previous GWAS at the time we designed the platform. These 343 SNPs were genotyped in samples from DACHS, DALS, French, HPFS, NHS and PHS studies (N = 4,062 cases and 4,718 controls) (Table 1; Supplemental Note) and 306 SNPs were successfully genotyped in all studies (see details below). After we selected SNPs for replication, the ARCTIC genome-wide scan became available (769 cases and 665 controls), and we used imputed data from that study for analysis of the 343 SNPs (12 SNPs were not included due to low imputation quality or low HWE p values). As of April 2010, we had genotyped and analyzed the GWAS data and replication data from ARCTIC, DACHS, DALS and the French case–control study. We selected 32 SNPs with p < 0.1 in this replication set and/or a fixed-effects p < 10⁻⁴ in the combined replication and GWAS for further genotyping in 2,550 cases and 3,539 controls, including additional samples from NHS, PHS, and HPFS, and samples from MECC and NFCCR. The top SNPs were also analyzed in a second set of data from the CCFR (780 cases and 780 controls). We present results for the total replication sample of 8,161 cases and 9,101 controls.
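The pruning rule quoted above can be illustrated with a minimal sketch. It assumes a greedy pass down the p-value-ranked list, which is one plausible reading of the procedure rather than the documented implementation; the function names, the tuple layout, and the `r2_lookup` dictionary are hypothetical, and all SNPs are assumed to lie on one chromosome.

```python
def is_redundant(r2, distance_bp):
    """Apply the two distance-dependent r^2 thresholds quoted in the text."""
    if distance_bp >= 100_000:
        return r2 > 0.5
    return r2 > 0.1


def prune_ranked_snps(snps, r2_lookup):
    """Greedy pruning (an assumption, not the paper's stated algorithm).

    snps: list of (snp_id, position_bp, p_value) tuples on one chromosome.
    r2_lookup: dict mapping frozenset({id1, id2}) to pairwise r^2 in controls;
    pairs missing from the dict are treated as r^2 = 0.
    """
    kept = []
    for snp_id, pos, p_value in sorted(snps, key=lambda s: s[2]):
        redundant = any(
            is_redundant(r2_lookup.get(frozenset((snp_id, kept_id)), 0.0),
                         abs(pos - kept_pos))
            for kept_id, kept_pos, _ in kept
        )
        if not redundant:
            kept.append((snp_id, pos, p_value))
    return kept
```

Each SNP is compared only against SNPs already retained, so the most significant SNP of any correlated group is the one that survives.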

Genotyping

Genomic DNA was extracted from blood samples or, in the case of a subset of PLCO samples, from buccal cells using conventional methods.

GWAS for CCFR

Genotyping was completed on the Illumina Human1M and Human1M-Duo Bead Array in accordance with the manufacturer’s protocol.

Sample exclusions The following sample exclusion criteria were applied: call rate < 95% (n = 75), any stripe (physical/analytical location on BeadChip) call rate < 80% (n = 9), discordance with prior genotyping (n = 3), non-White (n = 29), samples that showed admixture identified using the program STRUCTURE (n = 33) (Falush et al. 2003; Pritchard et al. 2000), high identity by descent using PLINK (n = 2), and mismatch between called and phenotypic sex (n = 4). The final analysis was based on 1,191 cases and 999 controls.

SNP exclusions SNPs were excluded if they did not overlap between the Illumina Human1M and Human1M-Duo (n = 190,301), were annotated as “Intensity Only” (n = 8,263), had call rates < 90% on either the Illumina Human1M or Human1M-Duo (n = 9,229), or by study center or case–control status (n = 12,695). When further restricting analysis to SNPs with Hardy–Weinberg Equilibrium (HWE) p > 0.0001, MAF > 0.05, and SNP call rate > 0.98, a total of 739,733 SNPs remained in the analysis.
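As an illustration of the final SNP filters (HWE p > 0.0001, MAF > 0.05, call rate > 0.98), the following sketch applies them to genotype counts for a single SNP. The thresholds come from the text; the function names are hypothetical, and the Hardy–Weinberg test shown is a simple 1-df chi-square test, which is an assumption because the text does not state which HWE test was used.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p value for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))


def minor_allele_frequency(n_aa, n_ab, n_bb):
    p = (2 * n_aa + n_ab) / (2 * (n_aa + n_ab + n_bb))
    return min(p, 1 - p)


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              hwe_min=1e-4, maf_min=0.05, call_min=0.98):
    """True if the SNP meets all three thresholds quoted in the text."""
    called = n_aa + n_ab + n_bb
    call_rate = called / (called + n_missing)
    return (call_rate > call_min
            and minor_allele_frequency(n_aa, n_ab, n_bb) > maf_min
            and hwe_chi2_p(n_aa, n_ab, n_bb) > hwe_min)
```

In practice these filters would normally be applied with PLINK, which the authors list among their analysis tools; the sketch only makes the logic of the three thresholds explicit.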

Average sample call rate was equal to 98.6% with >94% of samples having a call rate > 98%. Intra- and interplate replicate concordance rates were equal to 99.97 and 98.7%, respectively.

GWAS for DALS, PLCO and WHI

Genotyping was completed using Illumina HumanHap300 and HumanHap240S (PLCO), 550K (WHI, DALS) and 610K (DALS, PLCO) BeadChip Array System on the Infinium platform in accordance with the manufacturer’s protocol or as previously described for HumanHap300 and HumanHap240S (Yeager et al. 2007).

Sample exclusions Samples were excluded if the average call rate was <97% (DALS: n = 110, PLCO: n = 63, WHI: n = 66) or there was a mismatch between called and phenotypic sex (DALS: n = 6, PLCO: n = 1). To search for unexpected duplicates and closely related individuals we calculated identity-by-state values. We excluded unexpected duplicates (DALS, n = 2). Additionally, we excluded samples based on low concordance with prior genotyping (DALS: n = 10, WHI: n = 1) as well as samples that did not cluster with the CEU samples in principal component analysis including the three HapMap populations as a reference (DALS: n = 20, PLCO: n = 2, WHI: n = 6). The final analysis was based on 698 cases and 719 controls in DALS, 534 cases and 1,168 controls in PLCO, and 483 cases and 530 controls in WHI.

SNP exclusions Because we combined data from different platforms, we took precautions to exclude SNPs that do not perform consistently across platforms. This included SNPs reported by Illumina as not performing consistently across platforms (n = 78), SNPs found to have more than one discordant call across the 550K and 610K platforms in HapMap Data or our interplatform duplicates (n = 185); and SNPs with different MAF calls on the two platforms in our control populations (n = 9). We further filtered SNPs within each study (DALS, PLCO, WHI) based on MAF < 0.05% or HWE in controls < 0.0001. We applied a call rate per chip type per study of >98%. A total of 392,361 SNPs passed all QC checks for all three studies.

The average sample call rate was ≥98.8% in any of the three studies, and the concordance rate of blinded duplicates (n = 98 pairs) was >97%.

When we combined data across all scans, a total of 378,739 autosomal SNPs were successfully genotyped across all studies and used in our final GWAS meta-analysis of 2,906 cases and 3,416 controls.

Replication

Genotyping of 343 SNPs in DACHS, DALS, French, and the first sub-sets of HPFS, NHS and PHS was carried out using BeadXpress technology according to the manufacturer’s protocol. Problematic genotype clusters were visually inspected by lot number and the calling algorithm was adjusted, if indicated. A total of 35 SNPs were excluded from the analysis due to poor cluster quality and 2 SNPs were excluded for being out of HWE (p < 0.0001) in controls of at least one study. The 306 SNPs in the replication had call rates > 92% across studies (average call rate per SNP per study 97.8%). MECC and NFCCR samples were genotyped using Matrix-assisted Laser Desorption/Ionization Time-of-Flight on the Sequenom® MassARRAY 7K platform (Sequenom, Inc., San Diego, CA). A total of 23 and 30 SNPs were successfully genotyped in MECC and NFCCR, respectively. Additional samples from NHS, HPFS and PHS were genotyped on 29 SNPs using the TaqMan® OpenArray® Genotyping Instrument Platform Assays (Applied Biosystems, Carlsbad, CA). Overall, 32 SNPs had call rates > 98% across studies (average call rate per SNP per study 99.5%; Supplemental Table 7), indicating excellent quality.

Two GWAS data sets (ARCTIC and CCFR II) became available after the GWAS meta-analysis and were used only for replication as described above. ARCTIC has been previously published (Zanke et al. 2007). Because ARCTIC was genotyped on the Affymetrix platform with limited overlap of SNPs with the Illumina platforms, we made use of imputed data for this study. Imputation was done with BEAGLE, using the phased HapMap release 22 as the reference sample (http://ftp.hapmap.org/phasing/2007-08_rel22/) (Browning and Browning 2009). SNPs were removed if they were out of HWE (p < 0.0001) in the controls (n = 1) or had an imputation r² < 0.3 (n = 11). For CCFR phase II, samples were genotyped using the Illumina 1M Omni. Inclusion/exclusion criteria for cases in phase II were consistent with those described for phase I.

Statistical analysis

Study-specific analysis of GWAS data

To estimate the association between each genetic marker and risk for colorectal cancer, we calculated ORs and 95% CIs using log-additive genetic models relating the genotype dose (0, 1 or 2 copies of the minor allele) to risk of colorectal cancer. We adjusted for age, sex (when appropriate), center and the first three principal components from EIGENSTRAT to account for population substructure. The CCFR calculated these estimates with Cochran–Mantel–Haenszel analysis with strata defined by age, sex, and center.

Quantile–quantile (Q–Q) plots were assessed to determine whether the distribution of the p values in each study population was consistent with the null distribution (except for the extreme tail; Supplemental Figure 1). To quantify the data in the Q–Q plots, we calculated the inflation factor (λ) to measure the over-dispersion of the test statistics from association tests by dividing the mean of the test statistics by the mean of the expected values from a Chi-square distribution with 1 degree of freedom.
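The mean-based inflation factor described above can be sketched as follows. The sketch assumes that association p values are converted back to 1-df chi-square statistics and that the expected values are the chi-square quantiles at the plotting positions (i + 0.5)/n; both choices, and the function names, are illustrative assumptions rather than the authors' code.

```python
from statistics import NormalDist, mean

_STD_NORMAL = NormalDist()


def chi2_1df_from_p(p):
    """Chi-square (1 df) statistic whose upper-tail probability is p (0 < p <= 1)."""
    z = -_STD_NORMAL.inv_cdf(p / 2)  # a 1-df chi-square is a squared standard normal
    return z * z


def inflation_factor(p_values):
    """Mean observed statistic divided by mean expected statistic under the null."""
    n = len(p_values)
    observed = [chi2_1df_from_p(p) for p in p_values]
    expected = [chi2_1df_from_p((i + 0.5) / n) for i in range(n)]
    return mean(observed) / mean(expected)
```

A value of λ close to 1 indicates little over-dispersion; values above 1 suggest inflation, for example from residual population substructure.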

Combined analysis of GWAS

We conducted inverse-variance weighted fixed-effects meta-analysis to combine OR estimates from log-additive models or multiplicative methods across individual studies as described above. In this approach, we weighted the beta estimates of each study by their inverse variance and calculated a combined estimate by summing the weighted betas and dividing by the summed weights. We chose to focus on fixed effects because we only had a small number of studies. When the number of studies is small, the between-study variance may be poorly estimated, resulting in deflated test statistics for association. As such, fixed-effects analysis is better powered for discovery of novel variants (Kraft et al. 2009). We calculated I², which is a measure of the percentage of total variation across studies due to heterogeneity beyond chance, and obtained the heterogeneity p values based on Cochran’s Q statistic (Ioannidis et al. 2007).
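The pooling step described above, together with Cochran’s Q and I², can be written out as a short sketch. It takes per-study log odds ratios (betas) and their standard errors as input; the function name and the returned dictionary keys are hypothetical, and the two-sided p value uses a normal approximation.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects pooling of per-study log ORs."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided, normal
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of total variation beyond chance, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "beta": beta,
        "se": se,
        "or": math.exp(beta),
        "ci95": (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)),
        "p": p_value,
        "q": q,
        "df": df,
        "i2": i2,
    }
```

The heterogeneity p value would be the upper tail of a chi-square distribution with `df` degrees of freedom evaluated at `q`; it is omitted here to keep the sketch free of external dependencies.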

Study-specific analysis of replication data

To estimate the association between each genetic marker and risk for colorectal cancer, we calculated ORs and 95% CIs using a log-additive genetic model relating the genotype dose (0, 1 or 2 copies of the minor allele) to risk of colorectal cancer and adjusting for age, sex, and study center (as appropriate) in logistic regression analysis.
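A log-additive model treats the genotype dose as a single numeric predictor in logistic regression, so the per-allele OR is the exponentiated slope. The following minimal sketch fits an unadjusted model by Newton–Raphson; it deliberately omits the covariates (age, sex, study center) used in the paper, and the function name and return format are hypothetical.

```python
import math


def fit_log_additive(doses, status, n_iter=50, tol=1e-10):
    """Unadjusted logistic regression of case status (0/1) on genotype dose (0/1/2).

    Returns the per-allele odds ratio and a Wald 95% confidence interval.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(doses, status):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score for the intercept
            g1 += (y - p) * x    # score for the dose effect
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 information matrix against the score
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    se1 = math.sqrt(h00 / det)  # from the inverse information at the last step
    return {
        "or": math.exp(b1),
        "ci95": (math.exp(b1 - 1.96 * se1), math.exp(b1 + 1.96 * se1)),
    }
```

For example, if the odds of being a case are 1:2, 1:1 and 2:1 for 0, 1 and 2 copies of the minor allele, the log odds are exactly linear in dose and the fitted per-allele OR is 2.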

Combined analysis of replication data

We conducted inverse-variance weighted fixed-effects meta-analysis to combine OR estimates from log-additive models across individual studies and measured heterogeneity using I² and Cochran’s Q statistic, as discussed above.

Analysis of combined GWAS and replication data

We again combined across studies using inverse-variance weighted fixed-effects meta-analysis. For novel SNPs with p < 5 × 10⁻⁴ based on combined analysis of GWAS and replication, we also report random effects that incorporate potential heterogeneity into the effect estimate. For these SNPs, we also examined dominant, recessive and unrestricted genetic models and compared models by calculating the Akaike information criterion (AIC). We performed stratified analyses and evaluated whether the effects differed by sex. For novel SNPs in regions identified by previous GWAS, we also performed a conditional analysis including both the newly and previously identified SNPs in the region in one model to examine whether the effect of the newly identified SNP can be explained by the existing one. To quantify the independence of the novel SNPs from prior GWAS hits in the same region, we calculated the variance–covariance matrix and reported the correlation between the two betas. Finally, we performed a sensitivity analysis where we removed the studies one at a time and examined the results from the fixed-effect meta-analysis. We report any situations where removing one study resulted in a >5% change in the OR point estimate and/or reduced the p value of the combined fixed-effects meta-analysis to be <5 × 10⁻³, since that would indicate that the results might be driven by only one study.
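The leave-one-out sensitivity analysis described above can be sketched as follows. Each study is removed in turn, the fixed-effects estimate is recomputed, and the study is flagged when the pooled OR changes by more than 5%. The function names and the input layout (a dictionary of study name to beta and standard error) are hypothetical; the leave-one-out p value is returned so that the p-value criterion from the text can be applied alongside the OR criterion.

```python
import math


def pooled(estimates):
    """Inverse-variance fixed-effects pool of (beta, se) pairs; returns (beta, se)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return beta, math.sqrt(1.0 / total)


def leave_one_out(studies, max_or_change=0.05):
    """studies: dict mapping study name to (beta, se) on the log-OR scale."""
    full_beta, _ = pooled(list(studies.values()))
    full_or = math.exp(full_beta)
    results = {}
    for name in studies:
        rest = [est for other, est in studies.items() if other != name]
        beta, se = pooled(rest)
        odds_ratio = math.exp(beta)
        results[name] = {
            "or": odds_ratio,
            "p": math.erfc(abs(beta / se) / math.sqrt(2.0)),  # two-sided, normal
            "flagged": abs(odds_ratio - full_or) / full_or > max_or_change,
        }
    return results
```

A flagged study is one whose removal shifts the pooled OR by more than the chosen fraction, which is the pattern expected when a single study drives the association.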

Criterion for genome-wide significance

Based on an increasing number of papers providing a detailed discussion on the appropriate genome-wide significance threshold (Dudbridge and Gusnanto 2008; Hoggart et al. 2008; International HapMap Consortium 2005; Pe’er et al. 2008; Risch and Merikangas 1996; Wellcome Trust Case Control Consortium 2007), which all arrive at similar values in the range of 5 × 10⁻⁷ to 5 × 10⁻⁸ for White populations, we decided to use a p value of 5 × 10⁻⁸ as the genome-wide significance threshold. To account for the two-stage approach (GWAS and replication), we calculated that an overall p value of 5 × 10⁻⁸ is equal to a combined two-stage p value of 1.6 × 10⁻⁷ given our sample sizes in the GWAS and replication and a threshold for selecting SNPs from the GWAS of 1.2 × 10⁻³ as used here.

We used PLINK (Purcell et al. 2007; Purcell 2011) and R (R Development Core Team 2011) to conduct the statistical analysis and summarized results graphically using STATA (StataCorp 2009), snp.plotter (Luna and Nicodemus 2007), and LocusZOOM (Pruim et al. 2010).

Acknowledgments

The authors thank Dr. Ian Tomlinson at the Wellcome Trust Centre for Human Genetics, Oxford, UK, Dr. Richard Houlston at the Section of Cancer Genetics, Institute of Cancer Research, Sutton, UK, and Dr. Malcolm Dunlop at Colon Cancer Genetics Group, Institute of Genetics and Molecular Medicine, University of Edinburgh and Human Genetics Unit, Medical Research Council, Edinburgh, UK for providing access to GWAS summary statistics of the Colorectal Tumour Gene Identification Consortium (CORGI) and allowing us to use these results to inform the ranking of the SNP selection for the replication.

ARCTIC: This work was supported by the Cancer Risk Evaluation (CaRE) Program grant from the Canadian Cancer Society Research Institute. TJH and BWZ are recipients of Senior Investigator Awards from the Ontario Institute for Cancer Research, through generous support from the Ontario Ministry of Research.

CCFR: This work was supported by the National Cancer Institute, National Institutes of Health under RFA # CA-95-011 and through cooperative agreements with members of the Colon Cancer Family Registry and P.I.s. This genome-wide scan was supported by the National Cancer Institute, National Institutes of Health by U01 CA122839. The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the CFRs, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government or the CFR. The following Colon CFR centers contributed data to this manuscript and were supported by the following sources: Australasian Colorectal Cancer Family Registry (U01 CA097735), Familial Colorectal Neoplasia Collaborative Group (U01 CA074799), Mayo Clinic Cooperative Family Registry for Colon Cancer Studies (U01 CA074800), Ontario Registry for Studies of Familial Colorectal Cancer (U01 CA074783), Seattle Colorectal Cancer Family Registry (U01 CA074794), University of Hawaii Colorectal Cancer Family Registry (U01 CA074806).

DACHS: This work was supported by grants from the German Research Council (Deutsche Forschungsgemeinschaft, BR 1704/6-1, BR 1704/6-3, BR 1704/6-4 and CH 117/1-1), and the German Federal Ministry of Education and Research (01KH0404 and 01ER0814). We thank all participants and cooperating clinicians, and Ute Handte-Daub, Belinda-Su Kaspereit and Ursula Eilber for excellent technical assistance.

DALS: This work was supported by the National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (R01 CA48998 to MLS).

DALS, PLCO and WHI GWAS: Funding for the genome-wide scan of DALS, PLCO, and WHI was provided by the National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (R01 CA059045 to UP). CMH was supported by a training grant from the National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (R25 CA094880).

FRENCH: This work was funded by a regional Hospital Clinical Research Program (PHRC) and supported by the Regional Council of Pays de la Loire, the Groupement des Entreprises Françaises dans la LUtte contre le Cancer (GEFLUC), the Association Anne de Bretagne Génétique and the Ligue Régionale Contre le Cancer (LRCC).

GECCO: GECCO infrastructure is supported by the National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (U01 CA137088 to UP).

HPFS: This work was supported by the National Institutes of Health (P01 CA 055075 to C.S.F., R01 137178 to A.T.C., and P50 CA 127003 to C.S.F.). We acknowledge Patrice Soule and Hardeep Ranu for genotyping at the Dana-Farber Harvard Cancer Center High Throughput Polymorphism Core under the supervision of David J. Hunter, and Carolyn Guo for programming assistance.

MECC: This work was supported by the National Institutes of Health, U.S. Department of Health and Human Services (R01 CA81488 to SBG and GR).

NFCCR: This work was supported by an Interdisciplinary Health Research Team award from the Canadian Institutes of Health Research (CRT 43821); the National Institutes of Health, U.S. Department of Health and Human Services (U01 CA74783); and National Cancer Institute of Canada grants (18223 and 18226). The authors wish to acknowledge the contribution of Alexandre Belisle and the genotyping team of the McGill University and Génome Québec Innovation Centre, Montréal, Canada, for genotyping the Sequenom panel in the NFCCR samples.

NHS: This work was supported by the National Institutes of Health (P01 CA 087969 to ELG, R01 137178 to ATC, and P50 CA 127003 to CSF). We acknowledge Patrice Soule and Hardeep Ranu for genotyping at the Dana-Farber Harvard Cancer Center High Throughput Polymorphism Core under the supervision of David J. Hunter, and Carolyn Guo for programming assistance.

PHS: We acknowledge Patrice Soule and Hardeep Ranu for genotyping at the Dana-Farber Harvard Cancer Center High Throughput Polymorphism Core under the supervision of David J. Hunter, and Haiyan Zhang for programming assistance.

PLCO: This research was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and supported by contracts from the Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services. The authors thank Drs. Christine Berg and Philip Prorok at the Division of Cancer Prevention at the National Cancer Institute, and investigators and staff from the screening centers of the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, Mr. Thomas Riley and staff at Information Management Services, Inc., Ms. Barbara O’Brien and staff at Westat, Inc., and Mr. Tim Sheehy and staff at SAIC-Frederick. Most importantly, we acknowledge the study participants for their contributions to making this study possible.

Control samples genotyped as part of the Cancer Genetic Markers of Susceptibility (CGEMS) prostate cancer scan were supported by the Intramural Research Program of the National Cancer Institute. The datasets used in this analysis were accessed with appropriate approval through the dbGaP online resource (http://cgems.cancer.gov/data_access.html) through dbGaP accession number 000207v.1p1.c1 (National Cancer Institute 2009; Yeager et al. 2007). Control samples were also genotyped as part of the GWAS of Lung Cancer and Smoking. Funding for this work was provided through the National Institutes of Health, Genes, Environment and Health Initiative [NIH GEI] (Z01 CP 010200). The human subjects participating in the GWAS are derived from the Prostate, Lung, Colon and Ovarian Screening Trial and the study is supported by intramural resources of the National Cancer Institute. Assistance with genotype cleaning, as well as with general study coordination, was provided by the Gene Environment Association Studies, GENEVA Coordinating Center (U01 HG004446). Assistance with data cleaning was provided by the National Center for Biotechnology Information. Funding support for genotyping, which was performed at the Johns Hopkins University Center for Inherited Disease Research, was provided by the NIH GEI (U01 HG 004438). The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/gap through dbGaP accession number ph000093.v2.p2.c1.

WHI: The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119, 32122, 42107-26, 42129-32, 44221, and 268200764316C.

The authors wish to acknowledge Jacques Rossouw, Shari Ludlam, Joan McGowan, Leslie Ford, and Nancy Geller at the National Heart, Lung, and Blood Institute, Bethesda, Maryland; the following Clinical Coordinating Center investigators: Ross Prentice, Garnet Anderson, Andrea LaCroix, and Charles Kooperberg (Fred Hutchinson Cancer Research Center, Seattle, WA); Evan Stein (Medical Research Labs, Highland Heights, KY); and Steven Cummings (University of California at San Francisco, San Francisco, CA); and Sally Shumaker (Wake Forest University School of Medicine, Winston-Salem, NC) with the Women’s Health Initiative Memory Study.

In addition, we wish to acknowledge the following Clinical Center investigators: Sylvia Wassertheil-Smoller (Albert Einstein College of Medicine, Bronx, NY); Haleh Sangi-Haghpeykar (Baylor College of Medicine, Houston, TX); JoAnn E. Manson (Brigham and Women’s Hospital, Harvard Medical School, Boston, MA); Charles B. Eaton (Brown University, Providence, RI); Lawrence S. Phillips (Emory University, Atlanta, GA); Shirley Beresford (Fred Hutchinson Cancer Research Center, Seattle, WA); Lisa Martin (George Washington University Medical Center, Washington, DC); Rowan Chlebowski (Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA); Erin LeBlanc (Kaiser Permanente Center for Health Research, Portland, OR); Bette Caan (Kaiser Permanente Division of Research, Oakland, CA); Jane Morley Kotchen (Medical College of Wisconsin, Milwaukee, WI); Barbara V. Howard (MedStar Research Institute/Howard University, Washington, DC); Linda Van Horn (Northwestern University, Chicago/Evanston, IL); Henry Black (Rush Medical Center, Chicago, IL); Marcia L. Stefanick (Stanford Prevention Research Center, Stanford, CA); Dorothy Lane (State University of New York at Stony Brook, Stony Brook, NY); Rebecca Jackson (The Ohio State University, Columbus, OH); Cora E. Lewis (University of Alabama at Birmingham, Birmingham, AL); Cynthia A. Thomson (University of Arizona, Tucson/Phoenix, AZ); Jean Wactawski-Wende (University at Buffalo, Buffalo, NY); John Robbins (University of California at Davis, Sacramento, CA); F. Allan Hubbell (University of California at Irvine, CA); Lauren Nathan (University of California at Los Angeles, Los Angeles, CA); Robert D. Langer (University of California at San Diego, LaJolla/Chula Vista, CA); Margery Gass (University of Cincinnati, Cincinnati, OH); Marian Limacher (University of Florida, Gainesville/Jacksonville, FL); J. David Curb (University of Hawaii, Honolulu, HI); Robert Wallace (University of Iowa, Iowa City/Davenport, IA); Judith Ockene (University of Massachusetts/Fallon Clinic, Worcester, MA); Norman Lasser (University of Medicine and Dentistry of New Jersey, Newark, NJ); Mary Jo O’Sullivan (University of Miami, Miami, FL); Karen Margolis (University of Minnesota, Minneapolis, MN); Robert Brunner (University of Nevada, Reno, NV); Gerardo Heiss (University of North Carolina, Chapel Hill, NC); Lewis Kuller (University of Pittsburgh, Pittsburgh, PA); Karen C. Johnson (University of Tennessee Health Science Center, Memphis, TN); Robert Brzyski (University of Texas Health Science Center, San Antonio, TX); Gloria E. Sarto (University of Wisconsin, Madison, WI); Mara Vitolins (Wake Forest University School of Medicine, Winston-Salem, NC); Michael S. Simon (Wayne State University School of Medicine/Hutzel Hospital, Detroit, MI).

Supplementary material

439_2011_1055_MOESM1_ESM.doc (2.7 mb)
Supplementary material 1 (DOC 2,798 kb)

Copyright information

© Springer-Verlag 2011