Nomenclature for the KIR of non-human species

The increasing number of Killer Immunoglobulin-like Receptor (KIR) sequences available for non-human primate species and cattle has prompted development of a centralized database, guidelines for a standardized nomenclature, and minimum requirements for database submission. The guidelines and nomenclature are based on those used for human KIR and incorporate modifications made for inclusion of non-human species in the companion IPD-NHKIR database. Included in this first release are the rhesus macaque (Macaca mulatta), chimpanzee (Pan troglodytes), orangutan (Pongo abelii and Pongo pygmaeus), and cattle (Bos taurus).


Introduction
The KIR locus has been studied in a number of non-human species and is characterized by high levels of allelic polymorphism, haplotypic polymorphism in the number of genes, and extensive duplication and recombination (Hammond et al. 2016; Parham 2004). These factors have made it difficult to assign orthologues and have led to a number of different nomenclature systems being used to name genes and alleles. This report describes a common framework and guidelines for KIR nomenclature in non-human species. These have been developed by taking advantage of lessons learned in the development of a nomenclature system for the human KIR (Marsh et al. 2003).

General naming guidelines
To provide consistency with the IPD-MHC Database (Maccari et al. 2017), the non-human KIR nomenclature adopts the same four-character prefix used for species designation in the naming of MHC alleles (de Groot et al. 2012; Ellis et al. 2006; Klein et al. 1990). Genes and alleles are also named according to the conventions adopted for the human KIR system (Marsh et al. 2003), which are based on the structures of the molecules they encode. The first digit following the KIR acronym corresponds to the number of Ig-like domains in the polypeptide, and the "D" denotes "Domain." The D is followed by either an "L" indicating a "Long" cytoplasmic tail, an "S" indicating a "Short" cytoplasmic tail, or a "P" for pseudogenes. In addition, a "W" (for "Workshop") following the "L," "S," or "P" indicates any sequence that by phylogenetic analysis is sufficiently divergent to be considered a "new" gene, but that lacks either the genomic sequencing or the family studies needed to demonstrate that it defines a new gene and not a divergent lineage of a known gene. Tables 1, 2, and 3 list the current gene designations and their previous names. Symbols for genes are italicized (e.g., Mamu-KIR3DL01), whereas symbols for proteins are not italicized (e.g., Mamu-KIR3DL01). Alleles follow the same conventions as gene names.
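The structure of a gene name described above can be illustrated with a short parsing sketch. This is not part of the nomenclature itself: the regular expression, the function name parse_gene_name, and the GeneName record are assumptions made for illustration, and the pattern covers only the general form (species prefix, Ig-domain count, "D", tail code, optional "W", gene number), not lineage-specific additions such as the cattle "X".

```python
import re
from typing import NamedTuple

# Meaning of the tail codes, as given in the text.
TAIL_MEANING = {
    "L": "long cytoplasmic tail",
    "S": "short cytoplasmic tail",
    "P": "pseudogene",
}

# Hypothetical pattern for the general gene-name form only:
# four-character species prefix, "KIR", Ig-domain count, "D",
# tail code, optional "W" (Workshop), then the gene number.
GENE_RE = re.compile(
    r"^(?P<species>[A-Z][a-z]{3})-KIR(?P<domains>\d)D"
    r"(?P<tail>[LSP])(?P<workshop>W?)(?P<number>\d+)$"
)


class GeneName(NamedTuple):
    species: str      # e.g. "Mamu"
    ig_domains: int   # number of Ig-like domains
    tail: str         # "L", "S", or "P"
    workshop: bool    # True when the provisional "W" is present
    number: str       # gene number, kept as text to preserve leading zeros


def parse_gene_name(name: str) -> GeneName:
    """Split a gene name such as 'Mamu-KIR3DL01' into its components."""
    m = GENE_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognised KIR gene name: {name!r}")
    return GeneName(
        species=m["species"],
        ig_domains=int(m["domains"]),
        tail=m["tail"],
        workshop=m["workshop"] == "W",
        number=m["number"],
    )
```

For example, parse_gene_name("Mamu-KIR3DL01") yields three Ig-like domains, tail code "L", and gene number "01". The gene number is kept as a string so that the two-digit macaque numbering is not lost.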
Reflecting species-specific differences, there have been further additions/modifications to the general nomenclature for rhesus macaque and cattle. As with the human KIR nomenclature, alleles in each series have been named in order of their deposition into a generalist sequence databank, GenBank/EMBL-ENA/DDBJ (Benson et al. 2017; Chojnacki et al. 2017; Mashima et al. 2017). Where the identity of the animal providing the sequenced DNA is known, that information is included in the database, together with information on the origin of the animal. Tables 4, 5, 6, and 7 provide a complete list of genes and alleles currently in the nomenclature, as well as the original name(s), accession number, and reference to the original report of the sequence.
Each KIR allele name includes a unique number corresponding to up to three sets of digits separated by colons. All alleles are given a three-digit name, which corresponds to the first set of digits; longer names are assigned only when necessary.
The digits placed before the first colon describe the alleles that differ at non-synonymous substitutions (also called coding substitutions). Alleles that differ only by synonymous nucleotide substitutions (also called silent or non-coding substitutions) but are within the coding sequence are distinguished by their second sets of digits. Alleles that only differ by sequence polymorphisms in the introns, or in the 5′ or 3′ untranslated regions that flank the exons and introns, are distinguished by their third sets of digits. In addition to the unique allele number, optional suffixes can be added to an allele name to indicate the expression status of the gene and/or its encoded protein.
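The meaning of the three sets of digits can be made concrete by comparing two allele numbers field by field. The sketch below is illustrative only: the function names and the returned labels are assumptions, the widths of the second and third sets of digits are not taken from the text, and fields absent from either name are simply not compared.

```python
from typing import List, Optional

# Labels for the first, second, and third sets of digits, following the text.
LEVELS = (
    "non-synonymous (coding) difference",
    "synonymous difference within the coding sequence",
    "difference in introns or 5'/3' untranslated regions",
)
NO_DIFFERENCE = "no difference in the fields both names specify"


def split_fields(allele_number: str) -> List[Optional[str]]:
    """Split e.g. '001:02:01' into exactly three entries, padding with None."""
    parts = allele_number.split(":")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected allele number: {allele_number!r}")
    return parts + [None] * (3 - len(parts))


def difference_level(a: str, b: str) -> str:
    """Report the first set of digits at which two allele numbers differ."""
    for level, x, y in zip(LEVELS, split_fields(a), split_fields(b)):
        if x is None or y is None:
            break  # assumption: a missing field carries no information
        if x != y:
            return level
    return NO_DIFFERENCE
```

Under these assumptions, difference_level("001", "002") reports a coding difference, whereas difference_level("001:01", "001:02") reports a synonymous difference. The allele numbers used here are made up for illustration and do not refer to named alleles.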
Alleles known not to be expressed, so-called "Null" alleles, have been given the suffix "N." Alleles that have been shown to be alternatively expressed may have the suffix "L," "S," "C," "A," or "Q." The suffix "L" is used to indicate an allele that has been shown to have "Low" cell surface expression when compared to normal levels. The "S" suffix is used to denote an allele specifying a protein which is expressed as a soluble, "Secreted" molecule and is not present on the cell surface. The "C" suffix is assigned to alleles producing proteins that are present in the "Cytoplasm" and not on the cell surface. An "A" suffix indicates an "Aberrant" expression, where there is doubt as to whether a protein is actually expressed. A "Q" suffix is used when the expression of an allele is "Questionable," given that the mutation seen in the allele has been shown to affect normal expression levels in other alleles and other KIR genes.
As of May 2018, no alleles have been named with the "C," "A," "Q," or "S" suffixes. A schematic representation of the syntax for the non-human KIR allele designation is shown in Fig. 1.
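As an informal companion to the syntax summarized above, a complete allele designation can be split into gene, digit sets, and optional expression suffix. The pattern and the name parse_allele are assumptions for illustration. The sketch assumes an asterisk separates gene and allele number, as in Bota-KIR3DXL6*001N, and a three-digit first set; the widths of the later sets are left open.

```python
import re

# Expression suffixes and their meanings, as described in the text.
SUFFIX_MEANING = {
    "N": "Null: not expressed",
    "L": "Low cell surface expression",
    "S": "Secreted: soluble, not present on the cell surface",
    "C": "Cytoplasm: present in the cytoplasm, not on the cell surface",
    "A": "Aberrant: doubt as to whether a protein is expressed",
    "Q": "Questionable expression",
}

# Hypothetical pattern: gene name, "*", a three-digit first set, up to two
# further colon-separated sets of digits, then an optional suffix letter.
ALLELE_RE = re.compile(
    r"^(?P<gene>[^*]+)\*(?P<number>\d{3}(?::\d+){0,2})(?P<suffix>[NLSCAQ]?)$"
)


def parse_allele(name: str) -> dict:
    """Split an allele designation into gene, digit sets, and suffix."""
    m = ALLELE_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognised KIR allele name: {name!r}")
    suffix = m["suffix"] or None
    return {
        "gene": m["gene"],
        "fields": m["number"].split(":"),
        "suffix": suffix,
        "expression": SUFFIX_MEANING.get(suffix, "no expression suffix"),
    }
```

For instance, parse_allele("Bota-KIR3DXL6*001N") separates the gene Bota-KIR3DXL6, the single set of digits 001, and the "N" suffix marking a null allele.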

Species-specific guidelines
Naming rhesus macaque KIR genes
The Mamu-KIR sequences fall into a number of distinct lineages based on phylogenetic analysis. Most sequences correspond to lineage II KIR and are further divided into those encoding KIR that have long cytoplasmic tails or short cytoplasmic tails. The genes have been numbered sequentially and, where possible, the gene name has the same number as the first reported allele for that gene. For example, the Mamu-KIR3DL1 gene (Hershberger et al. 2001) was renamed Mamu-KIR3DL01*001.
The nomenclature uses a two-digit numbering of individual genes for the macaque sequences as seen with the naming of Mamu-KIR3DL01*001. This renaming aims to avoid confusion with previous sequence names. Subsequent analysis has shown that some of the proposed sequences of different genes are actually allelic variants of the same gene. Rather than skipping numbers to avoid confusion, it was thought better to introduce the two-digit numbering system.
Recombinant alleles are named according to the locus that provides the majority of the sequence. For example,           (Sambrook et al. 2006) out of exon 4, also two Ig domain KIR variants are expressed. The majority of the rhesus macaque gene sequence appears orthologous to hominoid KIR3DL3 sequences, the exception being exon 3 (encoding the D0 domain), which appears more like the hominoid KIR2DL5 sequences. This sequence relationship, coupled with the presence of splice variants that lacked exon 4, led to the naming of some of these sequences as Mamu-KIR2DL5. The presence of the intact gene, as evidenced by the published genomic sequence, as well as the existence of full-length (three Ig domain containing) sequences, has led us to propose naming this gene Mamu-KIR3DL20. This distinguishes the gene from the remaining Mamu-KIR3DL and retains the name of one of the first mRNA sequences that included all three Ig domain encoding exons; see Table 1 for further details. A full list of Mamu-KIR sequences is given in Table 4. Sequences identified in other macaque species will follow the same rules, using the appropriate species prefix (Mafa-KIR, Mane-KIR), and genes will be named to match the closest rhesus gene.

Naming chimpanzee KIR genes
Three studies (Abi-Rached et al. 2010; Khakoo et al. 2000; Sambrook et al. 2005) have described complete sequences of three chimpanzee haplotypes. In addition, analysis of chimpanzee KIR genotypes, from which the organization of genes was inferred, indicates the existence of another 17 chimpanzee KIR haplotypes. These analyses have defined 13 different Patr-KIR genes.
In all chimpanzee KIR haplotypes, the framework gene at the telomeric end is a lineage II KIR gene. Formerly, two variants, now known to occupy this position, were named Pt-KIR3DL1/2 and Pt-KIR3DL3. The name Pt-KIR3DL1/2 was given to reflect its close relationship to both human KIR3DL1 and KIR3DL2. Although segregation analysis showed that Pt-KIR3DL3 and Pt-KIR3DL1/2 were never present on the same haplotype, Pt-KIR3DL3 was given a different name because it has a distinctive sequence. We are renaming Pt-KIR3DL1/2 and Pt-KIR3DL3 as allelic variants of Patr-KIR3DL1, the new name for the framework gene at the telomeric end of the chimpanzee KIR locus. This allows the Patr-KIR3DL3 name to be given to the gene previously known as Patr-KIRC1, which is orthologous to human KIR3DL3, the framework gene at the centromeric end of the KIR locus. See Table 2 for further details. A full list of Patr-KIR sequences is given in Table 5.
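Because the name KIR3DL3 is reused for a different gene, the chimpanzee renaming is easy to misread. The lookup below restates only the three renamings given in the text; the dictionary and function names are assumptions for illustration, and the allele numbers assigned to the former Pt-KIR3DL1/2 and Pt-KIR3DL3 sequences are not stated here and are therefore omitted.

```python
# Former chimpanzee names mapped to the current gene name, as stated in the text.
PATR_RENAMES = {
    "Pt-KIR3DL1/2": "Patr-KIR3DL1",  # now an allelic variant of Patr-KIR3DL1
    "Pt-KIR3DL3": "Patr-KIR3DL1",    # likewise an allelic variant of Patr-KIR3DL1
    "Patr-KIRC1": "Patr-KIR3DL3",    # orthologous to human KIR3DL3
}


def current_patr_gene(former_name: str) -> str:
    """Return the current gene name for a former chimpanzee KIR name."""
    try:
        return PATR_RENAMES[former_name]
    except KeyError:
        raise KeyError(f"no renaming recorded for {former_name!r}") from None
```

Note that current_patr_gene("Pt-KIR3DL3") returns Patr-KIR3DL1, not Patr-KIR3DL3. The latter name now belongs to the gene formerly called Patr-KIRC1.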

Naming orangutan KIR genes
In the initial description of orangutan KIR cDNA, the sequences were given letter designations because their relationships, either alleles or genes, were uncertain. Subsequent studies (Guethlein et al. 2007a; Locke et al. 2011; Mager et al. 2001) have provided complete sequences of three orangutan KIR haplotypes, as well as genotyping data that have allowed the structures of two additional KIR haplotypes to be inferred. These genomic data, in combination with the cDNA sequences, defined 11 KIR genes and 1 KIR pseudogene in the orangutan. At first, all orangutan KIR were named as "Popy" (Guethlein et al. 2007b). The orangutan KIR are now divided into two series corresponding to the two species of orangutan: Popy for Pongo pygmaeus and Poab for Pongo abelii. Some KIR alleles are present in both orangutan species. These shared alleles have been given a different name in each species (Guethlein et al. 2015); see Table 3 for further details. A full list of Popy-KIR and Poab-KIR sequences is given in Table 6.

Naming cattle KIR genes
Assembly of the first cattle KIR haplotype allowed previously known cDNA sequences to be assigned to particular genes and allelic relationships to be defined (Dobromylskyj and Ellis 2007; Guethlein et al. 2007a; Hammond et al. 2016; Mager et al. 2001; Sanderson et al. 2014). This presents the opportunity to adopt an accurate and logical nomenclature system. Cattle KIR cDNA sequences were previously named using the established convention of Ig domain number and tail length. However, these alleles were annotated prior to the discovery of a second deeply divergent KIR lineage, the KIR3DX lineage (Guethlein et al. 2007a). The majority of the expanded cattle KIR belong to this second lineage. In developing a nomenclature system for the cattle KIR, we have incorporated their lineage ancestry within the name. Cattle KIR have been prefixed with the four-letter species designation "Bota" (Bos taurus), in line with non-human primates.
Where possible, previously named Bota-KIR have retained the same name, with only the addition of an "X" after the domain number if they belong to the KIR3DX lineage. There are three exceptions: Bota-KIR3DL1P and Bota-KIR3DL3, which are allelic, and Bota-KIR3DL2. These previously described cDNA sequences are all members of the KIR3DX lineage. Based on their position in the cattle haplotype and their relationships to other genes, Bota-KIR3DL1P was renamed Bota-KIR3DXL6*001N, Bota-KIR3DL3 was renamed Bota-KIR3DXL6*002, and Bota-KIR3DL2 was renamed Bota-KIR3DXL4. We have identified 16 cattle KIR genes. The proposed nomenclature for cattle KIR is given in Table 7.
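The cattle renaming rule can be summarized as a lookup for the three exceptions plus insertion of an "X" for KIR3DX-lineage sequences. The following is a minimal sketch under stated assumptions: the function name rename_cattle_kir and the is_kir3dx_lineage flag are hypothetical, lineage membership must be supplied by the caller because it cannot be read from the old name, and Table 7 remains the authoritative list.

```python
import re

# The three exceptions given in the text (former name -> new name).
CATTLE_EXCEPTIONS = {
    "Bota-KIR3DL1P": "Bota-KIR3DXL6*001N",
    "Bota-KIR3DL3": "Bota-KIR3DXL6*002",
    "Bota-KIR3DL2": "Bota-KIR3DXL4",
}


def rename_cattle_kir(old_name: str, is_kir3dx_lineage: bool) -> str:
    """Apply the cattle renaming rule to a previously used Bota-KIR name."""
    if old_name in CATTLE_EXCEPTIONS:
        return CATTLE_EXCEPTIONS[old_name]
    if not is_kir3dx_lineage:
        return old_name  # name retained unchanged
    # Insert "X" after the domain designation, e.g. KIR3DL -> KIR3DXL,
    # unless an "X" is already present.
    return re.sub(r"(KIR\dD)(?!X)", r"\1X", old_name, count=1)
```

As an illustration with a made-up input, rename_cattle_kir("Bota-KIR3DS1", True) gives Bota-KIR3DXS1, while the three exceptions are returned from the lookup regardless of the flag.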

Future guidelines
The sequences described in this report will be included in the Immuno Polymorphism Database (IPD) (Robinson et al. 2013). They will be maintained as a component of the IPD and be accessible at https://www.ebi.ac.uk/ipd/nhkir/. New sequences for any of the above species can be submitted using the current submission tool. As with the other databases, there are requirements that should be met before formal names can be given and the submitted KIR are included in the database. First, submission of full-length sequences is encouraged and, for some species such as rhesus macaque, is already mandatory. Second, novel sequences must be confirmed, either through their replication in multiple individuals or, at a minimum, by coming from multiple independent PCR/cloning experiments. Full guidelines for submission of non-human KIR sequences to IPD can be found at https://www.ebi.ac.uk/ipd/nhkir/submission/help. As KIR sequence data from other species reach the level of the species included in this report, those species can be included in the database. The inclusion of a species will be at the discretion of the Nomenclature Committee and IPD and will be based on the number of sequences available as well as evidence of identified genes and haplotype structure.
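The two minimum requirements above could be screened before submission with a simple check. This sketch is not the IPD submission tool: the Submission record, its field names, and the reading of "multiple" as at least two are assumptions, and only rhesus macaque is listed as requiring full-length sequences because it is the only species named in the text.

```python
from dataclasses import dataclass
from typing import List

# Species prefixes for which full-length sequence is mandatory.
# Only rhesus macaque is named in the text; further entries would be assumptions.
FULL_LENGTH_MANDATORY = {"Mamu"}


@dataclass
class Submission:
    species_prefix: str               # e.g. "Mamu", "Bota"
    full_length: bool                 # True if the sequence is full length
    individuals_observed: int         # individuals in which the sequence was found
    independent_pcr_experiments: int  # independent PCR/cloning experiments


def unmet_requirements(s: Submission) -> List[str]:
    """List the minimum requirements a submission does not yet meet."""
    problems = []
    if s.species_prefix in FULL_LENGTH_MANDATORY and not s.full_length:
        problems.append("full-length sequence is mandatory for this species")
    # Assumption: "multiple" is read as two or more.
    if s.individuals_observed < 2 and s.independent_pcr_experiments < 2:
        problems.append(
            "sequence not confirmed in multiple individuals or in "
            "multiple independent PCR/cloning experiments"
        )
    return problems
```

An empty list means that neither requirement is flagged. It does not imply acceptance, since the full submission guidelines apply.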