ISAG/IUIS-VIC Comparative MHC Nomenclature Committee report, 2005
- Cite this article as:
- Ellis, S.A., Bontrop, R.E., Antczak, D.F. et al. Immunogenetics (2006) 57: 953. doi:10.1007/s00251-005-0071-4
Nomenclature for Major Histocompatibility Complex (MHC) genes and alleles in species other than humans and mice has historically been overseen either informally by groups generating sequences, or by formal nomenclature committees set up by the International Society for Animal Genetics (ISAG). The suggestion for a Comparative MHC Nomenclature Committee was made at the ISAG meeting held in Göttingen, Germany (2002), and the committee met for the first time at the Institute for Animal Health, Compton, UK in January 2003. To publicize its activity and extend its scope, the committee organized a workshop at the International Veterinary Immunology Symposium (IVIS) in Quebec (2004), where it was decided to affiliate with the Veterinary Immunology Committee (VIC) of the International Union of Immunological Societies (IUIS). The goals of the committee are to establish a common framework and guidelines for MHC nomenclature in any species; to demonstrate this in the form of a database that will ensure that, in the future, researchers can easily access a source of validated MHC sequences for any species; and to facilitate discussion in this area between existing groups and nomenclature committees. A further meeting of the committee was held in September 2005 in Glasgow, UK. This was attended by most of the existing committee members, with some additional invited participants (Table 1). The aims of this meeting were to facilitate the inclusion of new species in the database, to discuss extension, improvement and funding of the database, and to address a number of nomenclature issues raised at the previous workshop.
Table 1 Comparative MHC Nomenclature Committee membership
Cornell University, USA
Moredun Research Institute, UK
Washington State University, USA
Institute for Animal Health, UK
Institute for Animal Health, UK
University of Manchester, UK
Anthony Nolan Research Institute, UK
Anthony Nolan Research Institute, UK
Baylor University Medical Center, TX, USA
Glasgow University, UK
University of Aberdeen, UK
Anthony Nolan Research Institute, UK
German Primate Centre, Göttingen, Germany
The MHC region has been investigated in many species (Parham 1999; Kelley et al. 2005), and numerous different nomenclature systems are used to name genes and alleles, both between and within species (Klein et al. 1990). By creating species groups on IPD-MHC, it is hoped that much of the confusion surrounding MHC nomenclature will be dispelled, and that comparative MHC studies will be greatly facilitated. As each of the sections within IPD-MHC is generally based on the work of a nomenclature committee, the website includes or links to a portable document format file of recent nomenclature reports. The website also provides a recognized location for updates to the nomenclature between published reports. These pages contain the official allele name, any previous designations, the EMBL, GenBank or DDBJ accession number(s), and a reference linked wherever possible to the PubMed abstract. Details on the source of the sequence are also provided where applicable. Some nomenclature committees may provide additional information, but the core components of any nomenclature report are the allele names, accession numbers, and publications.
The website also provides access to sequence alignments; these are provided using a common interface design. The alignment tool uses standard formatting conventions for the display of sequence alignments; these are based on those currently used by the IMGT/HLA Sequence Database (Robinson et al. 2000) and published recommendations for Human Gene Mutations (Antonarakis 1998; den Dunnen and Antonarakis 2001). The alignment tool options allow the user to display a subset of alleles of a particular locus, omit alleles that are unsequenced for a particular region, and align against a particular reference or consensus sequence. In addition, the sequences can be displayed as complete nucleotide sequence, partial sequences of single exons, or the amino acid sequence of the encoded protein. The IPD-MHC database is also available via FTP (ftp://ftp.ebi.ac.uk/pub/databases/ipd/mhc/). The FTP site provides the sequences in a number of predefined file formats including EMBL-like flat files, FASTA, and PIR formats. These formats can be used in many commonly used applications like BLAST and Clustal.
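As an illustration of how the FASTA files mentioned above might be consumed, the following is a minimal sketch of a FASTA reader. The record headers and sequences shown are invented placeholders; the actual header layout used in the IPD-MHC files is not described here and should be checked against the downloaded files.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may wrap, so join them under the current header.
            records[header] += line
    return records


# Hypothetical records: names and sequences are placeholders, not IPD-MHC data.
example = """>example_allele_1
ATGGCC
GTCATG
>example_allele_2
ATGGCA
"""
records = parse_fasta(example.splitlines())
```

For a real file, the same function could be called on an open file handle, since it only needs an iterable of lines.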
The first release of the IPD-MHC database incorporated data from groups specializing in nonhuman primates, canines (DLA) (Kennedy et al. 2001), and felines (FLA), together with all data previously available in the IMGT/MHC database (Robinson et al. 2003). Since this release, sequences from cattle (BoLA) (Russell et al. 1997; Davies et al. 1997), rat (RT1) and swine (SLA) (Smith et al. 2005a,b) have been added. Work is underway to include chicken (B and Y), horse (ELA), sheep (OLA), and fish sequences.
Naming of new MHC genes
Historically, the MHC in many species was named ‘LA’ to maintain some consistency with the human (HLA) naming system, for example SLA for pig, DLA for dog and BoLA for cattle. There were a number of problems with this nomenclature, the main one being that while in some cases the common name of a species was used, as in dog, in other cases a broader specificity was used, as in BoLA (potentially encompassing all Bovinae). Another problem was that in certain species the MHC was given a completely different name, as in H2 for mice, RT1 for rat, and B for chicken, and these names have been maintained for historical reasons.
In 1990 it was proposed (Klein et al. 1990) that a new MHC nomenclature be used in which genes were prefixed Mhc followed by a four-letter abbreviation of the species' scientific name. For example, DLA would become MhcCafa, with the first two letters derived from the genus Canis, and the last two from the species name, familiaris. It was apparent that this would lead to some identical names being generated for different species, and so the rule was that if a second species had the same letter combination, the next two letters of the genus or species name would be used. Klein et al. (1990) suggested that some of the well-established names be changed, for example, DLA to Cafa, BoLA to Bota, SLA to Sudo, and ELA to Eqca, but these were not universally adopted, and in each case researchers in these fields have for the most part retained the original names. However, this can pose a problem, as in the case of canines, where the DLA prefix is currently being used to name alleles shared between domestic dog, several wolf species, and coyote, and in some cases alleles found only in the latter species. This problem is likely to increase as many more sequences from different species are now being generated, and it is likely that allele sharing will be found to occur between very closely related species.
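The four-letter abbreviation rule can be sketched in a few lines of Python. This is an illustrative reading of the rule, not an official implementation: the function name `four_letter_code` and the `taken` set are hypothetical, and the collision handling (moving to the next two letters of the species name only) is one assumed interpretation of "the next two letters of the genus or species name".

```python
from typing import Set


def four_letter_code(genus: str, species: str, taken: Set[str]) -> str:
    """Build a code from two genus letters plus two species letters."""
    genus_part = genus[:2].capitalize()
    # Assumption: on a collision, slide a two-letter window along the
    # species name until an unused code is found.
    for start in range(0, len(species) - 1, 2):
        code = genus_part + species[start:start + 2].lower()
        if code not in taken:
            taken.add(code)
            return code
    raise ValueError(f"no unused code for {genus} {species}")


taken: Set[str] = set()
dog = four_letter_code("Canis", "familiaris", taken)
sheep = four_letter_code("Ovis", "aries", taken)
bighorn = four_letter_code("Ovis", "canadensis", taken)
horse = four_letter_code("Equus", "caballus", taken)

# If "Cafa" were already in use, the sketch falls back to the next two letters.
fallback = four_letter_code("Canis", "familiaris", {"Cafa"})
```

Under these assumptions the calls yield Cafa, Ovar, Ovca and Eqca, matching the abbreviations used in this report, and the fallback call yields Cami.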
This may therefore be a good time to extend the use of the ‘new’ nomenclature, or to at least adopt some of its more useful features. Each species group will make use of the nomenclature in the most appropriate way. For example, the sheep MHC will still be termed OLA, but individual alleles will be named according to their species; a domestic sheep (Ovis aries) allele might be OLA-Ovar-DRB1*00101. If an identical full-length sequence is found in more than one species, it could carry two names; thus, this allele would also be named OLA-Ovca-DRB1*00101 if it was found in Bighorn sheep (Ovis canadensis). Within the Bovidae subfamily Bovinae (Hassanin and Ropiquet 2004), it was decided to maintain the overall description BoLA, but in contrast to sheep, different types of cattle (Bos taurus and Bos indicus) will not be specified. This is because a high proportion of animals designated B. indicus are in fact hybrids. However, when MHC sequences from additional Bovinae species, e.g., American bison, are placed on IPD-MHC, these will be given an appropriate four-letter prefix.
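The structure of such allele names can be made concrete with a small parser. The following is a sketch only: the layout `<complex>-<species code>-<locus>*<digits>` is inferred from the single sheep example above, and the names `AlleleName`, `parse_allele` and `rename_for_species` are hypothetical helpers rather than part of any IPD-MHC tool.

```python
import re
from typing import NamedTuple


class AlleleName(NamedTuple):
    complex_name: str  # e.g. "OLA"
    species_code: str  # e.g. "Ovar"
    locus: str         # e.g. "DRB1"
    allele: str        # e.g. "00101"


# Assumed layout: <complex>-<four-letter species code>-<locus>*<allele digits>
_PATTERN = re.compile(r"^([A-Za-z0-9]+)-([A-Z][a-z]{3})-([A-Za-z0-9]+)\*(\d+)$")


def parse_allele(name: str) -> AlleleName:
    """Split an allele name into its four components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized allele name: {name!r}")
    return AlleleName(*match.groups())


def rename_for_species(name: str, species_code: str) -> str:
    """Return the name the same sequence would carry in another species."""
    parsed = parse_allele(name)
    return f"{parsed.complex_name}-{species_code}-{parsed.locus}*{parsed.allele}"


domestic = parse_allele("OLA-Ovar-DRB1*00101")
bighorn_name = rename_for_species("OLA-Ovar-DRB1*00101", "Ovca")
```

Here `bighorn_name` is the second name an identical full-length sequence would carry if it were also found in Bighorn sheep.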
Bos taurus/Bos indicus
B and Y
Canis lupus baileyi
Non-human primate (NHP)
Bonobo or pygmy chimpanzee
de Brazza's monkey
Northern night (owl) monkey
Northern night (owl) monkey
Northern night (owl) monkey
Long-haired spider monkey
Brown-headed spider monkey
Dusky titi monkey
Golden lion tamarin
Saddle back tamarin
Common squirrel monkey
New IPD sections: sheep, horse, fish, and chicken
The ovine section of IPD-MHC will contain information on the ovine leukocyte antigen (OLA) complex in domestic sheep O. aries and other species within the genus Ovis. Sequences derived from domestic sheep will be prefixed Ovar, and other species will be assigned appropriate names (see previous section). Classes I and II nomenclature will be broadly based on the HLA nomenclature system, and only full-length class I cDNA sequences will be included on IPD-MHC. Class I alleles will, for the time being, be prefixed ‘N’ for ‘not assigned’ to a locus as in the cattle (BoLA) section. Full details of nomenclature and conditions for submission and inclusion of new alleles can be found on IPD-MHC.
The forthcoming equine section of IPD-MHC will contain information on the ELA complex of the domestic horse, Equus caballus, and other member species of the genus Equus. Sequences derived from horses will be prefixed Eqca, and other species such as donkeys and zebra will be assigned appropriate names (see previous section). Nomenclature for the horse MHC classes I and II genes and alleles will be based on the HLA system. Only full-length, or nearly full-length, MHC class I cDNA sequences will be included on IPD-MHC. Class I alleles will be assigned to loci when possible, and if not, they will be prefixed ‘N’ for ‘not assigned’ to a locus, following the system used for cattle in the BoLA section. Full details of nomenclature and conditions for submission and inclusion of new alleles can be found on IPD-MHC.
The MHC class Ia and class II genes in bony fish are not located in a single complex, but are found on different chromosomes (Stet et al. 2003). This is a unique feature that sets the bony fish apart from all other vertebrates, including cartilaginous fish, which do possess a bona fide MHC. This poses a semantic problem when calling the major histocompatibility genes of bony fish MHC genes. Several authors have attempted to introduce the use of ‘MH’ when describing MHC genes in a number of fish species. Although MH genes have been identified in a large number of fish species, the best studied are those of the salmonid fish (Atlantic salmon, rainbow trout, brown trout, and several Pacific salmon species). In addition, the four-letter abbreviation and an agreed locus assignment are widely used in salmonids. By far the largest number of expressed class I and class II alleles (full-length and partial) have been described for Atlantic salmon (Salmo salar: Sasa) and rainbow trout (Oncorhynchus mykiss: Onmy). The class Ia locus is designated UBA, and the single class II loci are designated DAA/DAB. The IPD-MHC fish section will initially include only these two salmonid species. In the future, model species like zebrafish (Danio rerio: Dare) and three-spined stickleback (Gasterosteus aculeatus: Gaac) may also be included.
The chicken section of IPD-MHC will contain information about selected genes on chicken chromosome 16. This microchromosome is divided into two sides by the Nucleolar Organizing Region, which encodes rRNAs. One side is known to contain BLA (the classical class II A gene) and the B complex, including the classical MHC and BG genes, lectin-like genes, and CD1 genes, while the other side is known to contain the factor B gene and the Y locus, including nonclassical class I and class II B genes and lectin-like genes. Various researchers in the field have met as ad hoc nomenclature committees at several meetings, and a recommended nomenclature for the genes on chromosome 16 has been published (Miller et al. 2004). For the first stage of IPD-MHC, only full-length coding sequences from cDNA or from complete genes will be displayed, although a larger list including partial sequences (for instance exons two and three for BF and YF alleles and exon two for BLB and YLB alleles) may be established at a later date. The sequences will be named in accordance with the recommended nomenclature, but only those sequences whose genomic location has been unambiguously established will be assigned to particular loci; all others will be prefixed ‘N’ for ‘not assigned.’ MHC sequences from other bird species may be added in the future.
Inclusion of nonvalidated sequences on IPD-MHC
There is some concern that by directing attention toward IPD-MHC, a great deal of potentially useful MHC sequence data on public databases will be effectively lost by being ignored. Curators of some species groups felt that at least some of these data could be useful, even if they do not (at present) fulfill their own criteria for inclusion on IPD-MHC. A number of approaches could help resolve this problem. Selected sequences could be placed in a ‘pending’ section on IPD-MHC, which could be publicly accessible while clearly labeled as ‘non-validated,’ or could be hidden from public view. Another alternative would be to approach groups that have placed a significant number of sequences on public databases to encourage them to validate and subsequently submit their sequences to IPD-MHC. These and other options are currently under discussion and ultimately each group may take a different approach.
Nomenclature in species with variable MHC gene content
Analyses of MHC haplotypes in several species have revealed considerable intraspecies differences with respect to gene content. Such differences have mainly been found either in the MHC class I region, e.g., in mice (Kumanovics et al. 2002) and rats (Roos and Walter 2005), or in the class II region, as in humans (Stewart et al. 2004), or could be attributed to both regions, for example, in rhesus macaques (Daza-Vamenta et al. 2004; Doxiadis et al. 2003; Otting et al. 2005) and cattle (Ellis and Ballingall 1999). Species with a large number of class I genes appear to be prone to large differences in gene content, with the presence or absence of whole gene subfamilies (Roos and Walter 2005). Such gene content differences cause potential problems with respect to nomenclature and accurate assignment of genes and alleles. Accurate assignment therefore requires complete sequences or at least detailed physical maps of MHC haplotypes, yet these are not, and may never become, available for most species.
To ensure proper naming of new genes and alleles in those cases where complete haplotype sequences are not available, we propose that genes should be designated according to physical map position if available, or phylogenetic relationship where this gives a clear indication of relationship. In those cases where a physical map is available, for example in the rat, class I genes may be designated according to their localization in class I gene clusters, e.g., class I genes mapping to the RT1-CE region should be named RT1-CE and should receive a consecutive number, if they do not represent alleles of the already known genes RT1-CE1 to RT1-CE16. In cases where the fine mapping is unknown, phylogenetic tree reconstructions might help to properly assign genes and alleles or to define new subfamilies. In species with less complex haplotypes but where designation of genes and alleles is nonetheless problematic, an alternative approach is to simply prefix all alleles with ‘N’ to indicate ‘not assigned’ as in sheep and cattle.
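The proposal above amounts to a simple decision order plus a numbering rule, which can be sketched as follows. This is a rough illustration under stated assumptions: `naming_basis` and `next_gene_name` are hypothetical helpers, and treating the ‘N’ prefix as the fallback when neither a map nor a clear phylogeny is available is a simplification of the text, which presents it as an alternative for particular species.

```python
import re
from typing import Iterable


def naming_basis(has_physical_map: bool, phylogeny_is_clear: bool) -> str:
    """Pick the basis for naming, in the order proposed in the text."""
    if has_physical_map:
        return "physical map"
    if phylogeny_is_clear:
        return "phylogeny"
    # Assumed fallback: prefix alleles with 'N' for 'not assigned'.
    return "N (not assigned)"


def next_gene_name(region: str, known_genes: Iterable[str]) -> str:
    """Return the region name with the next unused consecutive number."""
    pattern = re.compile(rf"^{re.escape(region)}(\d+)$")
    numbers = []
    for gene in known_genes:
        match = pattern.match(gene)
        if match:
            numbers.append(int(match.group(1)))
    return f"{region}{max(numbers, default=0) + 1}"


# The known rat genes RT1-CE1 to RT1-CE16, as listed in the text.
known = [f"RT1-CE{i}" for i in range(1, 17)]
new_name = next_gene_name("RT1-CE", known)
```

In this sketch a new class I gene mapping to the RT1-CE region, and not allelic to any known gene, would be named RT1-CE17; deciding whether a sequence is a new gene or an allele of a known one is left outside the code.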
The International Society for Animal Genetics/International Union of Immunological Societies-Veterinary Immunology Committee Comparative MHC Nomenclature Committee has attempted in this report to establish a more robust common framework and guidelines for MHC nomenclature in any species; however, flexibility is essential. It is clear that constraints placed by the wide range of available data (sequence, mapping, and haplotype) will inevitably result in different species groups adopting slightly different ‘rules’ and approaches; this is acceptable as long as all fall broadly within the common guidelines. Each species group remains essentially independent, and IPD-MHC will reflect the decisions made by each nomenclature committee. It is intended that IPD-MHC should be developed further to broaden its role of disseminating information, for example, by including additional analysis tools. The data held should be used as constructively as possible, with the ultimate aim of facilitating comparative studies. To this end, specific funding will be sought in the near future. In addition, it is intended that further meetings of the committee will be held to continue discussion and encourage further sequence submissions to IPD-MHC.
The Comparative MHC Nomenclature Committee would like to thank IUIS-VIC for funding the meeting that led to this report and the University of Glasgow for hosting the meeting.