Analysis of Population-Based Genetic Association Studies Applied to Cancer Susceptibility and Prognosis

  • Xavier Solé
  • Juan Ramón González
  • Víctor Moreno
Part of the Applied Bioinformatics and Biostatistics in Cancer Research book series (ABB)


Over hundreds of thousands of years, genetic variation has been the keystone of human evolution and adaptation to the surrounding environment. Although this variation has brought great progress for the species, mutations in our DNA sequence may also increase the risk of developing diseases with an underlying genetic basis, such as cancer. Among the branches of genetic epidemiology, population-based association studies are one of the tools that can help us decipher which of these mutations are involved in the appearance or progression of disease. This chapter aims to be a didactic but thorough review for those interested in genetic association studies and their analytical methodology. It focuses mainly on SNP-array analysis techniques, covering quality control, assessment of association with disease, gene–gene and gene–environment interactions, haplotype analysis, and genome-wide association studies. The last part reviews some of the existing bioinformatics tools that perform the analyses described.


Keywords: Single Nucleotide Polymorphism; Lynch Syndrome; Population Stratification; Genetic Association Study; Haplotype Block



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Xavier Solé (1)
  • Juan Ramón González
  • Víctor Moreno
  1. Biostatistics and Bioinformatics Unit, Catalan Institute of Oncology – IDIBELL, Barcelona, Spain
