Volume 57, Issue 8, pp 549–558

Reliability of statistical associations between genes and disease



Many statistical associations between a disease and alleles of specific genes have proven to be irreproducible. In part, this irreproducibility can be attributed to a lack of replication before publication and to the fact that, until recently, the relationship between statistical significance and various measures of reproducibility was not widely understood. This review proposes a classification system, the Better Associations for Disease and GEnes (BADGE) system, for describing genetic associations. The BADGE classes, first class through fifth class, are based on the P value of the association. A first-class association, with P < 2×10⁻⁷, is expected to be reproducible even in the absence of other evidence supporting the association. A fifth-class association corresponds to conventional statistical significance (P < 5×10⁻²), which provides almost no assurance of reproducibility. Three intervening classes, described as second-, third-, and fourth-class associations, are defined by P values separated by factors of 20 or 25 from these extremes.
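
As a rough illustration of the classification described above, the following Python sketch maps a P value to a BADGE-style class. Only the first-class (2×10⁻⁷) and fifth-class (5×10⁻²) cut-offs are stated in the abstract. The three intermediate cut-offs used here (5×10⁻⁶, 1×10⁻⁴, 2×10⁻³) are an assumption: they are one ordering of the factors (25, 20, 20, 25) that spans the two stated extremes with round values, and the full article may define them differently. The names `BADGE_THRESHOLDS` and `badge_class` are hypothetical.

```python
# Illustrative BADGE-style classifier (a sketch, not the authors' code).
# Stated in the abstract: first class P < 2e-7, fifth class P < 5e-2.
# ASSUMED: the three middle cut-offs, chosen so that consecutive
# thresholds differ by factors of 25, 20, 20 and 25.
BADGE_THRESHOLDS = [
    (2e-7, "first class"),   # from the abstract
    (5e-6, "second class"),  # assumed (2e-7 x 25)
    (1e-4, "third class"),   # assumed (5e-6 x 20)
    (2e-3, "fourth class"),  # assumed (1e-4 x 20)
    (5e-2, "fifth class"),   # from the abstract (2e-3 x 25)
]


def badge_class(p_value):
    """Return the most stringent class whose cut-off exceeds p_value.

    Returns None when p_value does not reach conventional
    significance (P >= 0.05).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # Thresholds are sorted from most to least stringent, so the first
    # match is the best class the association qualifies for.
    for cutoff, label in BADGE_THRESHOLDS:
        if p_value < cutoff:
            return label
    return None


if __name__ == "__main__":
    for p in (1e-8, 3e-6, 5e-5, 1e-3, 0.04, 0.2):
        print(p, badge_class(p))
```

Under these assumed cut-offs, an association reported with P = 0.04 would be labelled fifth class, while one with P = 1×10⁻⁸ would be labelled first class.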


Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  1. Pathology and Laboratory Medicine, Anatomy and Neurobiology, Center of Genomics and Bioinformatics, University of Tennessee Health Science Center, Memphis, USA
  2. Biostatistics, University at Buffalo, Buffalo, USA
