Volume 65, Issue 9, pp 655–665

MHCcluster, a method for functional clustering of MHC molecules

  • Martin Thomsen
  • Claus Lundegaard
  • Søren Buus
  • Ole Lund
  • Morten Nielsen
Original Paper


The identification of peptides binding to major histocompatibility complexes (MHC) is a critical step in the understanding of T cell immune responses. The human MHC genomic region (HLA) is extremely polymorphic, comprising several thousand alleles, many encoding a distinct molecule. The potentially unique specificities remain experimentally uncharacterized for the vast majority of HLA molecules. Likewise, for nonhuman species, only a minor fraction of the known MHC molecules have been characterized. Here, we describe a tool, MHCcluster, to functionally cluster MHC molecules based on their predicted binding specificity. The method has a flexible web interface that allows the user to include any MHC of interest in the analysis. The output consists of a static heat map and graphical tree-based visualizations of the functional relationship between MHC variants, and a dynamic TreeViewer interface where both the functional relationship and the individual binding specificities of MHC molecules are visualized. We demonstrate that conventional sequence-based clustering fails to identify the functional relationship between molecules when applied to the MHC system, and that only through the use of the predicted binding specificity can a correct clustering be found. Clustering of prevalent HLA-A and HLA-B alleles using MHCcluster confirms the presence of 12 major specificity groups (supertypes), some, however, with highly divergent specificities. Importantly, some HLA molecules are shown not to fit any supertype classification. We also use MHCcluster to show that chimpanzee MHC class I molecules have a reduced functional diversity compared to that of HLA class I molecules. MHCcluster is available at
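As a rough illustration of what clustering by predicted binding specificity (rather than by sequence) can look like, the sketch below groups four made-up molecules by how similarly they score a shared set of peptides. Everything in it is an assumption made for illustration: the molecule names, the toy scores, the choice of 1 minus the Pearson correlation as a distance, and the use of average linkage. It is not a description of how MHCcluster itself computes distances or builds its trees.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def functional_distance(x, y):
    """Assumed distance: 1 - correlation, so similar specificities are close."""
    return 1.0 - pearson(x, y)


def average_linkage(scores):
    """Greedy average-linkage clustering; returns a nested-tuple tree."""
    # Each cluster is a pair: (tree built so far, list of member names).
    clusters = [(name, [name]) for name in scores]

    def cluster_distance(a, b):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(m, n) for m in a[1] for n in b[1]]
        total = sum(functional_distance(scores[m], scores[n]) for m, n in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters that are currently closest.
        a, b = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


# Hypothetical predicted binding scores for four made-up molecules over the
# same six peptides (toy numbers, not real predictions).
toy_scores = {
    "mol_A": [0.9, 0.8, 0.1, 0.2, 0.1, 0.7],
    "mol_B": [0.8, 0.9, 0.2, 0.1, 0.2, 0.6],
    "mol_C": [0.1, 0.2, 0.9, 0.8, 0.7, 0.1],
    "mol_D": [0.2, 0.1, 0.8, 0.9, 0.8, 0.2],
}

tree = average_linkage(toy_scores)
print(tree)
```

In this toy input, mol_A and mol_B prefer the same peptides, as do mol_C and mol_D, so the two pairs end up in separate branches of the tree regardless of how similar their sequences might be. A real analysis would use scores from a binding predictor over a large peptide set.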


Keywords: MHC, HLA, Binding motif, Functional clustering, MHC specificity, Supertypes



MN is a researcher at the Argentinean National Research Council (CONICET). This work was supported by NIH grant HHSN272200900045C.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Martin Thomsen (1)
  • Claus Lundegaard (1, 3)
  • Søren Buus (4)
  • Ole Lund (1)
  • Morten Nielsen (1, 2)
  1. Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
  2. Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, San Martín, Argentina
  3. ALK, Hørsholm, Denmark
  4. Laboratory of Experimental Immunology, University of Copenhagen, Copenhagen N, Denmark
