Analytical Methods for Disease Association Studies with Immunogenetic Data
Disease association studies of highly polymorphic immunogenetic data may be carried out at one or more units of analysis, including the amino acid, allele, genotype, and haplotype levels, and may also consider gene–gene or gene–environment interactions. The selection of appropriate statistical tests is critical and depends on the nature of the dataset (e.g., case-control vs. family data) as well as the specific research hypotheses being tested. This paper describes the various study and analysis categories used for such work, including the advantages and limitations of each technique.
Key words: HLA, KIR, Immunogenetic, Data analysis, Disease association, Case-control, Family
This work was supported by National Institutes of Health (NIH) grants U01AI067068 (JAH, SJM) and U19 AI067152 (PAG) awarded by the National Institute of Allergy and Infectious Diseases (NIAID) and 1R01DK061722 (JAH) awarded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Bonferroni correction: The most commonly used method of correcting for multiple comparisons. The test significance level is divided by the number of comparisons made in the study, thereby increasing the overall stringency of the significance testing.
Confidence interval: A range of values likely to contain the true value of a parameter at a stated confidence level (e.g., 95%).
Contingency table: Also known as a cross-tabulation. A 2 × n table used to analyze heterogeneity between two sets of observations of two or more categorical variables.
Disease association: The co-occurrence within a population, more often than expected by chance, of a genetic trait and a particular phenotype.
Logistic regression: A statistical method to determine which of a set of independent variables has a predictive relationship to a binary dependent (outcome) variable.
Multiple comparisons: Also known as multiple testing. Performing a statistical test multiple times in the same analysis, thereby increasing the number of chances that the null hypothesis will be incorrectly rejected, leading to false positive associations.
Odds ratio: The ratio of the odds of an outcome occurring in one group to the odds of it occurring in another group.
Population stratification: Also referred to as population substructure. Allele frequency differences between subpopulations within a study population due to ancestry differences or selection biases.
Relative risk: The risk of the outcome of interest in an exposed group relative to the risk in an unexposed group.
Yates correction: An adjustment to the χ² test statistic performed by subtracting 0.5 from the absolute value |O − E| for each cell in a contingency table. The correction accounts for sparse cells in the table, which may introduce discontinuity relative to the continuous χ² distribution.
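Several of the definitions above reduce to short calculations on a 2 × 2 table of counts. The following is a minimal sketch, not the authors' implementation: it computes the odds ratio with a log-based (Woolf) confidence interval, the χ² statistic with the optional 0.5 continuity adjustment (commonly called the Yates correction), and the per-test significance level obtained by dividing by the number of comparisons (commonly called the Bonferroni correction). The function names and the example counts are hypothetical, the table layout [[a, b], [c, d]] (cases with/without the allele, controls with/without the allele) is an assumption, and all four cells are assumed to be nonzero.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]] (assumed layout:
    cases with/without the allele, controls with/without the allele)."""
    return (a * d) / (b * c)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate confidence interval for the odds ratio on the log scale
    (Woolf's method); z=1.96 corresponds to roughly 95% confidence."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for a 2x2 table; if yates is True, 0.5 is
    subtracted from |O - E| in every cell before squaring."""
    observed = [[a, b], [c, d]]
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Never let the adjustment push the difference below zero.
                diff = max(0.0, diff - 0.5)
            stat += diff ** 2 / expected
    return stat


def chi_square_p_value_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance level after dividing by the number of comparisons."""
    return alpha / n_comparisons


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    a, b, c, d = 30, 70, 15, 85
    stat = chi_square_2x2(a, b, c, d, yates=True)
    print("odds ratio:", odds_ratio(a, b, c, d))
    print("approx. 95% CI:", odds_ratio_ci(a, b, c, d))
    print("chi-square (continuity-corrected):", stat)
    print("p-value:", chi_square_p_value_1df(stat))
    print("per-test alpha for 20 comparisons:", bonferroni_alpha(0.05, 20))
```

The sketch uses only the standard library so that each formula stays visible. In practice an established statistics package would usually be preferable, particularly for tables larger than 2 × 2 or with very small cell counts.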