Immunogenetics pp 245-266

Analytical Methods for Disease Association Studies with Immunogenetic Data

  • Jill A. Hollenbach
  • Steven J. Mack
  • Glenys Thomson
  • Pierre-Antoine Gourraud
Part of the Methods in Molecular Biology book series (MIMB, volume 882)


Disease association studies involving highly polymorphic immunogenetic data may be conducted at one or more levels of analysis, including the amino acid, allele, genotype, and haplotype levels, and may also consider gene–gene or gene–environment interactions. Selecting appropriate statistical tests is critical and depends on the nature of the dataset (e.g., case-control vs. family data) and on the specific research hypotheses being tested. This paper describes the study designs and analysis categories used for such studies, including their advantages and limitations.

Key words

HLA, KIR, Immunogenetic, Data analysis, Disease association, Case-control, Family



This work was supported by National Institutes of Health (NIH) grants U01AI067068 (JAH, SJM) and U19 AI067152 (PAG) awarded by the National Institute of Allergy and Infectious Diseases (NIAID) and 1R01DK061722 (JAH) awarded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.


Bonferroni correction

The most commonly used method of correcting for multiple comparisons. Generally, the test significance level is divided by the number of comparisons made in the study, thereby increasing the overall stringency of the significance testing.
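
As a minimal sketch of the division described above (the function names are illustrative and are not taken from the chapter), the corrected per-test threshold and the resulting significance calls could be computed as follows:

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Return the per-test significance level after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values from three allele-level tests.
print(bonferroni_significant([0.001, 0.02, 0.04]))
```

With three comparisons and a study-wide level of 0.05, each test is judged against 0.05 / 3, so only the first hypothetical p-value remains significant.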

Confidence interval

A range of values likely to contain the true value of an estimated parameter at a specified confidence level (e.g., 95%).
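
For illustration only, the sketch below computes a confidence interval for a proportion such as an allele frequency. It assumes the simple normal-approximation (Wald) interval, which is one of several possible methods and is not necessarily the one the chapter recommends; the function name and the counts are hypothetical.

```python
from statistics import NormalDist


def wald_interval(successes: int, total: int, confidence: float = 0.95):
    """Normal-approximation confidence interval for a proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * (p_hat * (1.0 - p_hat) / total) ** 0.5
    # Clip to the valid range for a proportion.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical data: an allele observed 30 times among 100 chromosomes.
lower, upper = wald_interval(30, 100)
print(round(lower, 3), round(upper, 3))
```

The approximation is known to behave poorly for small samples or frequencies near 0 or 1, where an exact method may be preferable.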

Contingency table

Also known as a cross-tabulation. A 2 × n table used to analyze heterogeneity between two sets of observations of two or more categorical variables.
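
One common way to test heterogeneity in such a table is Pearson's χ² statistic. The sketch below is an illustrative standard-library implementation, assuming every row and column total is nonzero; the function name and the example counts are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical 2 x 3 table: rows are cases and controls, columns are alleles.
print(chi_square_statistic([[10, 20, 30], [30, 20, 10]]))
```

The statistic is then compared with the χ² distribution on the returned degrees of freedom to obtain a p-value.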

Genetic association

The co-occurrence within a population of a genetic trait and a particular phenotype more often than expected by chance.

Logistic regression

A statistical method for determining which of a set of independent variables have a predictive relationship with a binary dependent (outcome) variable.
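
As a rough sketch of the idea, the code below fits a logistic regression with an intercept and a single predictor by Newton-Raphson iteration, using only the standard library. It is an illustrative toy, not a substitute for a statistical package; the function name, the fixed iteration count, and the data are assumptions made for this example.

```python
import math


def fit_logistic(xs, ys, n_iter=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to b0
            g1 += (y - p) * x      # gradient with respect to b1
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 linear system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: x = 1 if an allele is carried, y = 1 for cases.
xs = [1] * 30 + [0] * 70 + [1] * 15 + [0] * 85
ys = [1] * 100 + [0] * 100
intercept, slope = fit_logistic(xs, ys)
print(round(intercept, 4), round(slope, 4))
```

With a single binary predictor, the fitted slope equals the natural logarithm of the odds ratio from the corresponding 2 × 2 table, which gives a convenient check on the fit.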

Multiple comparisons

Also known as multiple testing. Performing a statistical test multiple times in the same analysis, thereby increasing the number of chances that the null hypothesis will be incorrectly rejected, leading to false positive associations.
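
To make the inflation concrete, the following sketch computes the probability of at least one false positive across a set of tests. It assumes the tests are independent, which is often not true of linked immunogenetic loci, so the result should be read as an illustration rather than an exact figure; the function name is hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among independent tests."""
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_tests


# Error rate for 1, 5, 20, and 100 tests at a per-test level of 0.05.
for n in (1, 5, 20, 100):
    print(n, round(familywise_error_rate(0.05, n), 3))
```

Under these assumptions, twenty tests at a per-test level of 0.05 already give roughly a 64% chance of at least one spurious association.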

Odds ratio

The ratio of odds of an outcome occurring in one group to the odds of it occurring in another group.
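
For a 2 × 2 table, this definition reduces to a cross-product ratio. The sketch below shows the arithmetic; the argument names and the counts are hypothetical, and the function simply refuses tables in which the ratio is undefined.

```python
def odds_ratio(group1_with, group1_without, group2_with, group2_without):
    """Odds of the outcome in group 1 divided by the odds in group 2."""
    if group1_without == 0 or group2_with == 0:
        raise ValueError("odds ratio is undefined for this table")
    # (a / b) / (c / d) simplifies to (a * d) / (b * c).
    return (group1_with * group2_without) / (group1_without * group2_with)


# Hypothetical counts: 30 of 100 cases and 15 of 100 controls carry an allele.
print(round(odds_ratio(30, 70, 15, 85), 3))
```

A value above 1 indicates higher odds in the first group, a value below 1 indicates lower odds, and a value of 1 indicates no difference.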

Population stratification

Also referred to as population substructure. Allele frequency differences between subpopulations within a study population due to ancestry differences or selection biases.

Relative risk

A measure of the risk of the outcome of interest in an exposed group relative to the risk in an unexposed group.
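
The sketch below computes this ratio of risks from the four cells of a 2 × 2 table. It assumes cohort-style counts in which the risk in each group can be estimated directly; the argument names and the data are hypothetical.

```python
def relative_risk(exposed_with, exposed_without,
                  unexposed_with, unexposed_without):
    """Risk of the outcome in the exposed group over the unexposed group."""
    risk_exposed = exposed_with / (exposed_with + exposed_without)
    risk_unexposed = unexposed_with / (unexposed_with + unexposed_without)
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when unexposed risk is 0")
    return risk_exposed / risk_unexposed


# Hypothetical counts: 30 of 100 exposed and 15 of 100 unexposed
# individuals develop the outcome.
print(relative_risk(30, 70, 15, 85))
```

Note that the same counts give an odds ratio of about 2.43, so the two measures agree closely only when the outcome is rare.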

Yates correction for continuity

An adjustment to the χ² test statistic performed by subtracting 0.5 from the absolute value of the observed minus expected count, |O − E|, for each cell in a contingency table. The correction compensates for approximating discrete counts with the continuous χ² distribution, which matters most when cells in the table are sparse.
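
A minimal sketch of the corrected statistic for a 2 × 2 table is given below. It assumes nonzero row and column totals and, as an implementation choice made here, never lets the adjusted difference fall below zero; the function name and the counts are hypothetical.

```python
def yates_chi_square(table):
    """Chi-square statistic with Yates continuity correction (2 x 2 table)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Shrink |O - E| by 0.5, but not past zero (assumption noted above).
            adjusted = max(0.0, abs(observed - expected) - 0.5)
            statistic += adjusted ** 2 / expected
    return statistic


# Hypothetical 2 x 2 table: rows are cases and controls,
# columns are allele present and allele absent.
print(round(yates_chi_square([[10, 20], [20, 10]]), 3))
```

For this table the uncorrected statistic is about 6.67, so the correction makes the test more conservative.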



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Jill A. Hollenbach (1)
  • Steven J. Mack (1)
  • Glenys Thomson (2)
  • Pierre-Antoine Gourraud (3)
  1. Center for Genetics, Children’s Hospital and Research Center Oakland, Oakland, USA
  2. Department of Integrative Biology, University of California, Berkeley, USA
  3. Department of Neurology, University of California, San Francisco, USA
