HLA and HIV Infection Progression: Application of the Minimum Description Length Principle to Statistical Genetics

  • Peter T. Hraber
  • Bette T. Korber
  • Steven Wolinsky
  • Henry A. Erlich
  • Elizabeth A. Trachtenberg
  • Thomas B. Kepler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4345)


The minimum description length (MDL) principle was developed in the context of computational complexity and coding theory. It states that the best model to account for some data minimizes the sum of the lengths, in bits, of the descriptions of the model and the data as encoded via the model. The MDL principle gives a criterion for parameter selection, by using the description length as a test statistic. Class I HLA genes play a major role in the immune response to HIV, and are known to be associated with rates of progression to AIDS. However, these genes are extremely polymorphic, making it difficult to associate alleles with disease outcome, given statistical issues of multiple testing. Application of the MDL principle to immunogenetic data from a longitudinal cohort study (Chicago MACS) enables classification of alleles associated with plasma HIV RNA abundance, an indicator of infection progression. Variation in progression is strongly associated with HLA-B. Allele associations with viral levels support and extend previous studies. In particular, individuals without B58s supertype alleles average viral RNA levels 3.6 times greater than individuals with them. Mechanisms for these associations include variation in epitope specificity and selection that favors rare alleles.
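The two-part criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' coding scheme, which the abstract does not specify. It assumes a Gaussian model for log10 viral RNA within each group and approximates the model cost with the common asymptotic term of (k/2) log2 n bits for k parameters. The function names and the toy viral-load values are hypothetical. Because the data are continuous, the data cost is defined only up to a constant set by measurement precision, so only differences between description lengths are meaningful.

```python
import math

def gaussian_code_length_bits(values):
    """Data cost: negative log-likelihood, in bits, under a Gaussian
    fitted to the values by maximum likelihood."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-12)  # guard against zero variance
    nll_nats = 0.5 * n * (math.log(2 * math.pi * var) + 1)
    return nll_nats / math.log(2)  # convert nats to bits

def description_length_bits(groups):
    """Two-part description length: model cost plus data cost.
    Assumption: model cost is approximated as (k / 2) * log2(n)."""
    n = sum(len(g) for g in groups)
    k = 2 * len(groups)  # one mean and one variance per group
    model_bits = 0.5 * k * math.log2(n)
    data_bits = sum(gaussian_code_length_bits(g) for g in groups)
    return model_bits + data_bits

# Hypothetical log10 viral RNA values, invented for illustration only.
with_allele = [3.1, 3.4, 2.9, 3.3, 3.0, 3.2]
without_allele = [4.0, 4.3, 3.9, 4.1, 4.2, 3.8]

pooled = description_length_bits([with_allele + without_allele])
split = description_length_bits([with_allele, without_allele])

# The grouping by allele is retained only if it shortens the total description.
print("pooled model:", round(pooled, 2), "bits")
print("split model: ", round(split, 2), "bits")
print("allele grouping preferred:", split < pooled)
```

In this toy case the split model pays for two extra parameters but saves more than that in data cost, so the grouping is kept. When the two groups have the same distribution, the extra parameters are not repaid and the pooled model gives the shorter description.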


Keywords: Human Leukocyte Antigen; Minimum Description Length; Human Leukocyte Antigen Allele; Human Leukocyte Antigen Gene; Human Leukocyte Antigen Molecule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Peter T. Hraber (1, 2)
  • Bette T. Korber (1, 2)
  • Steven Wolinsky (3)
  • Henry A. Erlich (4)
  • Elizabeth A. Trachtenberg (5)
  • Thomas B. Kepler (6)
  1. Santa Fe Institute, Santa Fe, USA
  2. Los Alamos National Laboratory, Los Alamos, USA
  3. Feinberg School of Medicine, Northwestern University, Chicago, USA
  4. Roche Molecular Systems, Alameda, USA
  5. Children’s Hospital Oakland Research Institute, Oakland, USA
  6. Department of Biostatistics and Bioinformatics, Duke University Medical Center, Duke University, Durham, USA
