Introduction

Cells of the innate immune system play a wide range of roles in immune defense, inflammation, and tissue regeneration, including first-line defense against infections and shaping of subsequent adaptive immune responses. Their activation is regulated by a large number of receptor–ligand interactions, including a series of paired receptors. These are closely related receptors encompassing inhibitory and activating members that are encoded by genes lying in close physical proximity to each other (reviewed in Lanier 2001; Yamada and McVicar 2008). Particular attention has been paid to the paired receptors referred to as killer cell lectin-like receptors (KLR) and killer cell immunoglobulin-like receptors (KIR) which regulate cell activation and cytotoxicity in NK cells. Despite providing opposite signaling functions, several KLR and KIR exhibit almost identical ligand binding domains. Subtle sequence differences between the partners confer in some cases dissimilar ligand specificities (Biassoni et al. 1997; Vales-Gomez et al. 1998); in other cases, the partners may share ligands (Kaiser et al. 2005; Nakamura et al. 2004; Naper et al. 2005). The KLR and KIR are non-rearranging, i.e., they are fully encoded by the germ-line and hence the products of selective forces operating exclusively during phylogeny. Some KLR and KIR ligands are fast-evolving MHC class I molecules. In chasing common evasive ligands, the opposing partners must move in step with respect to ligand binding properties while at the same time conserving their opposite signaling functions, raising the question of molecular mechanisms enabling this feat.

The KLR and KIR belong to two separate superfamilies: the C-type lectin (CLSF) and the immunoglobulin (IgSF) superfamilies, respectively. Their basic structures, however, are similar in that they are both single-pass transmembrane (TM) proteins consisting of three distinct functional parts: (1) the ligand binding domain (LBD), consisting of the lectin-like domain for the CLSF receptors and the Ig domains for the IgSF receptors; it is this part that needs to undergo rapid evolution in step with receptor partners; (2) the external membrane-proximal stalk, where pronounced sequence variation between different receptor subgroups indicates relaxed sequence constraints and accordingly modest selection pressures; and (3) the signal transduction part (STP), comprising the cytoplasmic tail and the TM region. If the opposing signaling functions are to be maintained, key features of this part of the molecule must be conserved. For the inhibitory variants, these features are represented by immunoreceptor tyrosine-based inhibition motifs (ITIMs) in the cytoplasmic tail. The activating variants lack ITIMs but are instead associated with TM adaptor proteins carrying cytoplasmic immunoreceptor tyrosine-based activation motifs (ITAMs) (reviewed in Renard et al. 1997). In most cases, the key component of this association is ionic bonding between a positively charged amino acid residue in the TM domain of the receptor and a negatively charged residue in the TM domain of the adaptor (Lanier et al. 1998; Smith et al. 1998).

Based on phylogenetic comparisons of KIR and KLRA receptors dissected in this way, it was concluded that the activating variants are short-lived, recurrently evolving from inhibitory receptors (Abi-Rached and Parham 2005). Continuous neogenesis of activating receptors from inhibitory ones would in principle explain the near-identical LBDs of opposing partners. We here subject receptors from other regulatory leukocyte receptor families to similar studies, with main emphasis on the KLRC receptor family. Both in rodents and in primates, our analysis demonstrates that the opposing KLRC variants existed before speciation and have undergone concerted evolution with post-speciation gene homogenization restricted to the LBDs. Gene conversion is shown to play a role in this process. We propose the term merohomogenization (from Greek meros, part) for this phenomenon, which is also demonstrated for the KLRI, KLRB, and PIR receptor families and is likely for members of the KLRA receptor family. On the basis of these results, we postulate that merohomogenization represents a mechanism for generating and maintaining repertoires of opposing receptors chasing rapidly evolving ligands.

Results

Intraspecific sequence similarities of rodent KLRC receptors

Both in the mouse and the rat, the KLR gene subfamily KLRC contains three members, Nkg2a (Klrc1), Nkg2c (Klrc2), and Nkg2e (Klrc3) (Berg et al. 1998; Lohwasser et al. 1999; Vance et al. 1998). Exon 1 encodes most of the cytoplasmic region, exon 2 the remaining cytoplasmic part plus the TM region, exon 3 the membrane-proximal stalk, and the last three translated exons (4–6) the lectin-like domain that contains the ligand binding site (Kaiser et al. 2005; Natarajan et al. 2002). In addition, the 3′ untranslated regions (3′-UTR) are encoded by more than one exon. The NKG2A receptors have cytoplasmic tails containing two ITIMs and are inhibitory, whereas NKG2C and –E lack ITIMs and instead form activating receptors together with CD94 and DAP12.

In both species, the genes are situated between Klrk1 and Klri1 (Saether et al. 2005) in the middle part of the NKC on mouse chromosome 6 and rat chromosome 4, with gene order Nkg2a → Nkg2c → Nkg2e. Conservation of chromosomal positions, order, orientations, and approximate intergenic distances indicates orthology. The alignment of the predicted amino acid sequences (Fig. 1) shows that the parts encoded by the first three rat Nkg2a (rnNkg2a) exons are more similar to the corresponding exons of the presumed mouse ortholog Nkg2a (mmNkg2a) than to the rnNkg2c and rnNkg2e paralogs. For the exons encoding the LBD, however, rnNkg2a is more similar to the rat paralogs than to the mouse ortholog, as is mmNkg2a to the mouse paralogs. Furthermore, throughout the sequence, both rat and mouse Nkg2c and Nkg2e exhibit greater similarity within (intraspecific) than between (interspecific) species. These patterns become clearer by exonwise comparisons of informative nucleotide substitution sites (Fig. 2) and by visualization of relatedness by dendrograms (Fig. 3), both methods demonstrating the striking shift in patterns of inter- versus intraspecific similarities between the first and the last three coding exons of these genes.
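The exonwise comparison described above can be illustrated with a minimal sketch. It assumes that the exon sequences have already been aligned to equal length, and it simply asks whether a query exon is closer to its presumed ortholog or to an intraspecific paralog. The function names and the short toy sequences are hypothetical and chosen only for illustration; they are not the sequences analyzed here.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two aligned sequences of equal
    length, ignoring columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def closest_relative(exon_seqs: dict, query: str) -> str:
    """Name of the sequence (other than the query) most similar to the query."""
    others = {name: seq for name, seq in exon_seqs.items() if name != query}
    return max(others, key=lambda name: percent_identity(exon_seqs[query], others[name]))


# Hypothetical toy alignments (NOT real Nkg2 sequences).
toy_exons = {
    "exon1": {"rnNkg2a": "ATGGATAACC", "mmNkg2a": "ATGGATAACA", "rnNkg2c": "ATGAGTAAAC"},
    "exon5": {"rnNkg2a": "GGCTCATTGG", "mmNkg2a": "GGATCGTTAG", "rnNkg2c": "GGCTCATTGA"},
}

for exon, seqs in toy_exons.items():
    print(exon, "rnNkg2a is closest to", closest_relative(seqs, "rnNkg2a"))
```

In this toy data the query groups with the ortholog for exon 1 and with the paralog for exon 5, which is the kind of shift in affiliation between signaling and ligand-binding exons that the text describes.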

Fig. 1

Amino acid sequences of rat (r) and mouse (m) KLRC receptors. Dashes indicate identity with top sequence (rat NKG2E); points indicate gaps. Numbers on top of alignment show the start site of corresponding exons. Cytoplasmic ITIM are boxed; additional ITIM-like motifs are boxed with broken lines. Putative TM regions are underlined. Below the alignment are indicated conserved secondary motifs in the CLSF lectin-like domains (e.g., α1—α-helix 1, β1—β-strand 1, L1—loop 1)

Fig. 2

Exonwise comparisons of substitution sites in KLRC genes. a Mouse (mm) versus rat (rt), b human versus chimpanzee (pt), and c human (hs) versus rhesus monkey (rm). Nucleotide sequences for each exon were aligned and sites were removed if one of the following conditions was fulfilled: (1) all bases were identical, (2) only one of the genes differed from the others, (3) one or more of the genes exhibited gaps. The bases were replaced by symbols, where filled circles indicate the identity with the top sequence and the other symbols (open circles, red squares, blue inverted pyramids) the non-identical bases. Above the sequences are shown exon numbers, and in a also the numbers of compared bases. For example, 115 bases were compared for exon 6. At 92 sites, all six were identical and at ten sites only one of the six deviated from the others. These 102 sites were omitted, leaving the 13 sites shown. a The truncated exon 3 of mmNkg2e shown in Fig. 1 (reported in Vance et al. 1998) may represent a splice variant; in the mouse genome sequence mmNkg2e was found to contain a sequence identical to exon 3 of mmNkg2c, as shown here. The alignment for exon 5 has been extended ~100 nt into intron 5 (@|in5) to demonstrate the changing affiliation of mmNkg2e (see “Results”). b The sequence referred to as ptC is based on the ptNKG2CI 01 allele (Khakoo et al. 2000). ptNKG2CII has, for clarity, been omitted. Single asterisk, hsNKG2F and ptNKG2F lack exons 5 and 6. b, c Two asterisks, as first described for hsNKG2E (Adamkiewicz et al. 1994) this gene as well as ptNKG2E and rmNKG2FE have insertion–deletion affecting the last ~20 nt of exon 6, resulting in sequence alteration of the conserved β5-strand, including the loss of the cysteine involved in disulphide bonding to the α1-helix. These sites are not shown
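The three site-removal rules stated in the legend of Fig. 2 can be sketched as a small column filter. This is a minimal illustration under stated assumptions, not the procedure actually used to prepare the figure: the function name `informative_columns` is hypothetical, gaps are assumed to be written as '-', at least three aligned sequences are assumed, and the four short sequences are invented toy data.

```python
from collections import Counter


def informative_columns(aligned: dict) -> list:
    """Return the 0-based alignment columns that survive the three rules:
    (1) drop columns where all bases are identical,
    (2) drop columns where exactly one sequence differs from all the others,
    (3) drop columns where any sequence has a gap."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    kept = []
    for i in range(length):
        column = [s[i] for s in seqs]
        if "-" in column:                      # rule 3: gap present
            continue
        counts = Counter(column)
        if len(counts) == 1:                   # rule 1: all identical
            continue
        if counts.most_common(1)[0][1] == len(column) - 1:
            continue                           # rule 2: a single deviating sequence
        kept.append(i)
    return kept


# Hypothetical toy alignment (NOT real KLRC sequences).
toy = {
    "gene1": "ACGTAC-A",
    "gene2": "ACGTACGA",
    "gene3": "ACTTGCGA",
    "gene4": "ATTTGCGC",
}
print(informative_columns(toy))

# Bookkeeping from the legend for exon 6: 115 compared sites,
# 92 identical in all six genes and 10 with a single deviating gene.
print(115 - 92 - 10)
```

Only columns in which at least two sequences share a base that differs from the base shared by others are retained, which is why the displayed alignments underrate the overall sequence differences.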

Fig. 3

Exonwise comparisons of rodent and primate KLRC nucleotide sequences. a Mouse (m) versus rat (r), b human (h) versus chimpanzee (p), c human versus rhesus monkey (m), d human (h) versus rat (r). A, C, E, and F refer to NKG2 family member names. Presumed or proven inhibitory receptors (with ITIMs) shown in red and with minus sign; receptors without ITIMs are in blue and with plus sign. Background colors indicate species. Exon 1 at the bottom and exon 6 at the top. The lengths of the vertical lines are proportional to nucleotide differences

The intraspecific sequence similarities result from gene homogenization

The close intraspecific sequence similarities between the LBD of the three KLRC genes are not restricted to the coding parts but encompass intronic as well as intergenic sequences covering a chromosomal region of ~52 kb in the mouse and ~43 kb in the rat, which in both species contain all six exons of Nkg2c and –e and exons 4–6 of Nkg2a. The higher intraspecific than interspecific similarities could be explained if the corresponding genes are not true orthologs but originally were paralogs belonging to two ancestral Nkg2 gene clusters, one later lost in the mouse and the other in the rat, as discussed for the rodent versus human gene clusters (Vance et al. 1999). Orthology is indicated, however, by the conservation of chromosomal positions as well as the close interspecific sequence identities of 84.3% and ~86% between the upstream and downstream flanking sites, respectively (not shown), i.e., ~14–16% divergence, which is comparable to the average synonymous substitution rate of 13.7 ± 0.8% of orthologous rat–mouse gene pairs (Li 1993). There are furthermore revealing exceptions to the rule of greater intra- than interspecific sequence similarities involving the large intron 4 of rnNkg2a (~2.8 kb) and of mmNkg2e (~1.8 kb). The major part of rnNkg2a intron 4 shows >80% sequence conservation with mmNkg2a intron 4, but no similarity to the rat paralogs, and the major part of mmNkg2e intron 4 shows sequence conservation with rnNkg2e, but no similarity to the mouse paralogs. Both introns are flanked by sequences showing higher intraspecific than interspecific similarities (not shown), demonstrating that this phenomenon must have other causes than converse loss of duplicated gene clusters.

Given orthology, the intraspecific sequence similarities of the LBD could have arisen by parallel evolution, assuming selection pressure for binding of ligands common to the activating and inhibitory receptors and interspecific divergence of these ligands. However, 13 of the sites which exhibited interspecific difference but intraspecific identity for all three genes in both species represented synonymous substitutions. As these would be invisible to selection, they are too many to have occurred by chance independently of each other in all three genes. Furthermore, as mentioned, the similarities extend into intronic and intergenic regions. The greater intraspecific than interspecific sequence similarities therefore bear evidence of concerted evolution of the repeats (Elder and Turner 1995). Importantly, for NKG2A versus NKG2C/NKG2E, i.e., receptors of opposite signaling functions, homogenization was restricted to the exons encoding the LBD.
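The argument above rests on classifying codon differences as synonymous or non-synonymous. A minimal sketch of such a classification is given here, assuming the standard genetic code and two in-frame, gap-free coding sequences of equal length. The function name and the toy sequences are hypothetical, and the sketch counts differing codons rather than individual sites, which is a simplification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(cds_a: str, cds_b: str) -> tuple:
    """Count differing codons between two in-frame coding sequences,
    returned as (synonymous, non_synonymous)."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    synonymous = non_synonymous = 0
    for i in range(0, len(cds_a), 3):
        a, b = cds_a[i:i + 3].upper(), cds_b[i:i + 3].upper()
        if a == b:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1       # same amino acid, presumably invisible to selection
        else:
            non_synonymous += 1   # amino acid change
    return synonymous, non_synonymous


# Hypothetical toy coding sequences (NOT real Nkg2 exons).
print(count_codon_differences("CTTGATAAA", "CTCGAAAAA"))
```

Differences that fall in the synonymous class in all paralogs of one species, yet differ between species, are the ones that are hard to attribute to parallel selection on the protein.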

Evidence of homogenization by gene conversion

The two major mechanisms for gene homogenization are contractions/expansions through multiple unequal crossing-over events and gene conversion through non-reciprocal sequence transfer from one chromatid to another through heteroduplex formation (reviewed in Szostak et al. 1983; Trowsdale and Parham 2004). A difficulty with the former mechanism in explaining the observed patterns is that it invokes several independent crossing-over events with highly similar end results in both species. Furthermore, it is a relatively crude mechanism, engaging large, continuous chunks of DNA, making discontinuous homogenization patterns difficult to explain. To elaborate on the examples of discontinuity referred to above, homogenization of rnNkg2a with the other two rat genes starts ~50 nt upstream of exon 4 and continues for only ~250 nt, to stop ~50 nt downstream of the exon. It is then resumed at the other end of intron 4, ~100 nt upstream of exon 5. Even more revealing is mmNkg2e, where homogenization with mmNkg2a is restricted to three separate patches of a few hundred nucleotides each, covering the whole of exon 4, the major part of exon 5, and the whole of exon 6, but with the intervening sequences more similar to the rat ortholog (Fig. 2a). The patchy homogenization patterns involving short DNA regions provide evidence that gene conversion has contributed to the homogenization processes.
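The patchy pattern described above, with short homogenized stretches interrupted by sequence that follows the ortholog, can be visualized with a simple windowed comparison. The sketch below is only an illustration of that idea, not the analysis performed here: the function name, the window size, and the artificial sequences are assumptions, and the three sequences are taken to be aligned and gap-free.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def window_affiliation(query: str, paralog: str, ortholog: str,
                       window: int = 10, step: int = 10) -> list:
    """For each window, report whether the query is closer to the
    intraspecific paralog or to the interspecific ortholog."""
    if not (len(query) == len(paralog) == len(ortholog)):
        raise ValueError("sequences must be aligned to equal length")
    calls = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        to_paralog = identity(query[start:end], paralog[start:end])
        to_ortholog = identity(query[start:end], ortholog[start:end])
        if to_paralog > to_ortholog:
            label = "paralog"
        elif to_ortholog > to_paralog:
            label = "ortholog"
        else:
            label = "tie"
        calls.append((start, label))
    return calls


# Artificial toy sequences: the query copies the paralog in two patches
# and the ortholog in between (NOT real data).
paralog = "A" * 30
ortholog = "C" * 30
query = "A" * 10 + "C" * 10 + "A" * 10

for start, label in window_affiliation(query, paralog, ortholog):
    print(start, label)
```

Alternating paralog and ortholog calls along a gene are the signature that is difficult to reconcile with unequal crossing-over, which would be expected to homogenize long continuous stretches.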

Comparisons of human KLRC genes with other species indicate timescale of events

Human versus chimpanzee

In human, there are four non-allelic NKG2 genes, NKG2A, −C, −E, and −F, and in the chimpanzee (Pan troglodytes) there are five, named as in the human, except that NKG2C apparently is duplicated (Khakoo et al. 2000; Shum et al. 2002). NKG2A has ITIMs and is inhibitory; the others have no ITIM and are probably activating (pt- and hsNKG2F and ptNKG2CII seem to be pseudogenes). For exon 1, the four genes of the two species sort neatly into orthologous pairs (Figs. 2b and 3b) (ptNKG2CII exhibits close sequence similarity to ptNKG2CI, apart from an inactivating frameshift mutation early in exon 4. To simplify the presentation, ptNKG2CII has therefore not been included in the alignments). For exons 1–3, NKG2C, −E, and −F exhibit close sequence similarity and differ markedly from NKG2A. For exons 4–6, however, NKG2C is similar to NKG2A, whereas NKG2E is more distant, indicating A/C homogenization, but with the event occurring before speciation as the pattern is the same in the chimpanzee and the human. The exception is hsNKG2C exon 5, which follows −E rather than −A. The changing affiliations of hsNKG2C reflect patchy homogenization and are therefore most likely due to gene conversion.

Human versus rhesus monkey

In the rhesus monkey (Macaca mulatta), several KLRC cDNAs have been reported (Labonte et al. 2000, 2001). Many of these seem to be splice variants, leaving four probably non-allelic genes, encoding the NKG2A receptor with ITIMs and the −C, −C2, and −EF receptors without. (To avoid confusion with the mouse sequences, the rhesus monkey genes are referred to as rmNKG2A, etc.) Sequence comparisons of exons 1–3 indicate that rmNKG2A/hsNKG2A and rmNKG2FE/hsNKG2F represent orthologous pairs. For the other two genes, sequence evidence of orthology has probably been erased by homogenization. In the case of exons 4–6, homogenization has clearly taken place. Here all the rhesus monkey genes are more similar to their intraspecific paralogs than to their presumed human orthologs and vice versa (except for the deviant sequence of hsNKG2E exon 6 (Figs. 2c and 3c)). A prominent feature is the shifting patterns with respect to the exons that exhibit the closest sequence similarities. An example, in addition to hsNKG2C pointed to above, is rmNKG2C, which is identical with −C2 for exon 4 and with FE for exons 5 and 6 but which lacks the insertion–deletion affecting the last ~20 nt of exon 6 in rmNKG2FE, hsNKG2E, and ptNKG2E (see figure legend of Fig. 2).

Human versus rodents

Finally, comparisons between human and rodents show that the intraspecific similarities exceed the interspecific ones for all six exons (Fig. 3d). As previously suggested, this may reflect differential loss of duplicated gene clusters (Vance et al. 1999). More likely, it is evidence of gene homogenization extending into exons 1–3 of Nkg2a versus Nkg2c and –e.

Evidence of gene homogenization in other regulatory leukocyte receptor families

The KLRI receptors

The rodent Klri1 and Klri2 genes are situated in the middle part of the NKC between Klrh1 and Nkg2a (Flornes et al. 2010; Saether et al. 2005). KLRI1 is ITIM-bearing and inhibitory, whereas KLRI2 lacks ITIMs, has a transmembrane arginine, and is activating (Saether et al. 2008). The exon–intron structure is as described for the KLRC genes. The gene order, orientations, and intergenic distances are conserved between the mouse and rat genes, indicating that they represent true orthologs (Saether et al. 2005). As expected of orthologous genes, the first three mmKlri1 exons are more similar to the corresponding rnKlri1 exons than they are to the rnKlri2 exons and vice versa (Fig. 4a). For the exons encoding the lectin-like domain, however, the KLRI genes display the same pattern of greater intra- than interspecific sequence similarities as the KLRC genes. This is particularly striking in the mouse, where the two genes exhibit nearly identical sequences for all three corresponding exons (Fig. 4a).

Fig. 4

Exonwise comparisons between mouse and rat KLRI, KLRB and PIR receptors. Mouse sequences indicated with m and red background; rat sequences indicated with r and blue background (coding exons only). For KLRI (a) and KLRB (b), the comparisons were made at the nucleotide level between each of the six coding exons. Exon 1 at the bottom and exon 6 at the top. For PIR (c), the comparisons were made at the amino acid level between the STPs (left) and the six Ig loops of the LBDs (right). Red and blue letters and +/− signs as in Fig. 3. a 1—KLRI1 is inhibitory, 2—KLRI2 is activating. b A, B, C, D, E, and F indicate NKR-P1 family member. c A1 through A6 denote the corresponding, presumed activating, PIR-A receptors; B—the ITIM-bearing PIR-B receptor

The KLRB receptors

The rodent KLRB genes encode the NKR-P1 receptor family and are situated at the centromeric end of the NKC. Intermingled with the KLRB genes are the Clr genes, which encode the ligands for the NKR-P1 family members. In the mouse, there are five genes: mmNkrp1a, −b/d (−b and −d are alleles), −c, −f and −g. In the rat, there are four: rnNkrp1a, −b, −f and −g (for recent revision of the rat nomenclature, see (Kveberg et al. 2009, 2011)). The genes cluster in two groups: the Nkrp1a and −b (including −c in the mouse) and the −f and −g groups. In both species, Nkrp1a (and −c) and −f encode activating receptors, −b and −g inhibitory (the functions of the −f and −g receptors predicted from sequence features, not yet formally shown) (Kveberg et al. 2009). As described in Kveberg et al. (2011), the two Nkrp1f genes are orthologs, as are the two Nkrp1g genes. The definition of orthologs between the mouse and rat Nkrp1a, −c, and −b loci is more difficult (Kveberg et al. 2009). Exonwise sequence alignments revealed close interspecific sequence similarities for the first three exons, as expected for true orthologs (Fig. 4b). For the exons encoding the lectin-like domain, homogenization has erased evidence of orthology, similar to the KLRC and KLRI genes. The Nkrp1f and −g pairs, in contrast, have resisted homogenization, with the orthologs exhibiting a close interspecific sequence similarity for all six exons.

The PIR receptor family

The paired immunoglobulin-like receptors, first identified as a family of opposing receptors in the mouse (Kubagawa et al. 1997), are expressed by B cells and myeloid cells (Takai 2005) and recognize MHC class I molecules (Nakamura et al. 2004). They are type I transmembrane proteins with six external Ig-like domains. Both in the mouse and in the rat (as predicted from the rat genome sequence), there are several activating receptors, referred to as PIR-A, with an arginine in the transmembrane region and no ITIM, but only one inhibitory receptor, referred to as PIR-B, with uncharged transmembrane region and three cytoplasmic ITIMs with an inhibitory capability (Uehara et al. 2001). In both species, the PIR receptors are encoded by the leukocyte receptor gene complex on mouse chromosome 7 and on rat chromosome 1 (Hoelsbrekken et al. 2003; Takai 2005). The results of sequence comparisons of the PIR receptors are visualized by the dendrograms in Fig. 4c. Whereas the inhibitory and activating KLR receptors have the same intron-exon structure, the activating members of the KIR and PIR receptor families have much shorter cytoplasmic tails than their inhibitory partners. To simplify the presentation, the cytoplasmic tail and the transmembrane region of each receptor were grouped into an STP, and the six IgSF domains into an LBD. The STPs of the rat and mouse inhibitory receptors show close interspecific sequence similarities, as do the STPs of the activating receptors. In contrast, in both species, the LBDs of the PIR-A and PIR-B receptors exhibit close intraspecific sequence similarities. Particularly striking is the almost complete identity between mmPira1 and mmPirb, bearing evidence of recent homogenization of the exons encoding the LBDs.

The KLRA receptors

The rat KLRA gene family contains 34 Ly49 (Klra) loci, of which 26 may encode functional receptors (Nylenna et al. 2005). Pronounced haplotype diversity indicates that not all of them will be present and/or functional within a single animal. Moreover, for the same reason, additional Ly49 loci may exist within haplotypes not yet investigated. Thirteen of the genes encode predicted inhibitory receptors (with a cytoplasmic ITIM), eight encode activating receptors (with a transmembrane arginine residue), and five encode “bifunctional” receptors (carrying both sequence features). In the mouse, only about half as many Ly49 genes have been reported so far, with a predominance of inhibitory receptors. In both species, the Ly49 genes are regularly spaced throughout the distal half of the NKC with no other intervening genes. In the rat, this region can be divided into three major subregions or blocks, each containing one subfamily of related Ly49 genes (with Ly49p8 and −s2 intermingled with the block II genes), and with three single gene subfamilies falling outside the blocks. By sequence comparison of the mouse and rat genes, only one clear-cut orthologous pair can be identified, mmLy49B and rnLy49i8, the most telomeric gene in both species (Nylenna et al. 2005).

Most rat and mouse Ly49 genes form subgroups with close intraspecific sequence similarities throughout the genes. This could be due to post-speciation gene duplications, gene homogenization, or both. Birth-and-death evolution (reviewed in Nei and Rooney 2005) has most likely played a major role in shaping this genetic region, but evidence of mechanisms akin to those described above emerges with exonwise gene comparisons (Fig. 5) (to simplify the presentation, the STP encoding exons 2 (cytoplasmic) and 3 (TM) have been grouped, as have the LBD encoding exons 5–7). When comparing exons 2–3, the inhibitory receptors cluster separately from the activating and bifunctional ones. When comparing exons 5–7, however, pairs of inhibitory and activating members are formed (or bifunctional instead of activating members). The almost complete sequence identities for many of the pairs are telling signs of recent recombination or homogenization events.

Fig. 5

Exonwise comparisons between selected rat and mouse KLRA (Ly49) receptors. Mouse sequences: m and red background; rat sequences: r and blue background. The comparisons were made at the amino acid level, with sequences grouped into STPs (cytoplasmic and transmembrane domains encoded by exons 2 and 3), the stalk (encoded by exon 4), and the lectin-like domain (encoded by exons 5–7). Red: ITIM-bearing receptors; blue: non-ITIM-bearing receptors with transmembrane arginine; black: receptors with both features (“bifunctional”). Branches with roman numbers I–III indicate rat receptors encoded by chromosomal segments (blocks) I–III (see text). Arabic numbers indicate bootstrapping values in percent (1,000 iterations). It should be noted that the STP of the activating variants form three separate branches and persistently do so with variation of parameters when tested both at the amino acid and at the nucleotide level and when the exons encoding the cytoplasmic tail and the TM are analyzed separately or together

Discussion

Among the first rat Ly49 receptors reported, some had nearly identical extracellular lectin-like domains but at the same time remarkably different cytoplasmic regions, proposed to serve different signaling functions (Dissen et al. 1996). Similar observations were also reported for other families of leukocyte receptors (Biassoni et al. 1996; Lazetic et al. 1996; Mason et al. 1996) which were shown to mediate either activating or inhibitory signals and became referred to as opposing receptors. Arase and Lanier (2002) ascribed the phenomenon to birth-and-death evolution and proposed that activating receptors are continuously created from inhibitory ones as a result of pathogen-driven selection to counteract the microbial exploitation of the inhibitory variants. The hypothesis received support by the phylogenetic analyses of KIR and Ly49 receptors, based on apparent monophyly of the activating receptors (Abi-Rached and Parham 2005). We here provide evidence that concerted evolution may play a central part in this phenomenon. A separate documentation of concerted evolution from birth-and-death evolution requires interspecific comparisons and the establishment of orthologous relationships between genes. For this reason, the small KLRC family was selected for a closer scrutiny, with a main focus on the rat and the mouse. Here orthology could be established both for the activating and for the inhibitory KLRC. The close interspecific similarities of the signaling domains bore evidence that both variants existed before speciation, with the corollary that the intraspecific LBD similarities are products of gene homogenization. Although the events leading to homogenization have occurred independently in the rat and the mouse, the end results were strikingly similar, with homogenization between NKG2A, −C, and −E exclusively of the exons encoding the LBDs. Because it is restricted to parts of genes, we have termed this phenomenon merohomogenization. The phenomenon was furthermore demonstrated for KLRC genes in primates and for other KLR subfamilies and the PIR family in rodents. As shown in Flornes et al. (2004) it has also occurred between the DCIR (dendritic cell immunoreceptor) and DCAR (dendritic cell immunoactivating receptor) receptors of the rat and mouse APLEC (antigen presenting cell lectin-like receptor gene complex).

The patchy homogenization patterns observed for the LBDs of rodent KLRC receptors could in principle be explained by a convoluted chain of multiple recombination events but are more easily explained by gene conversion which, depending on selective constraints, seems to be a particularly suitable mechanism for restricting co-evolution to parts of the genes (Shyue et al. 1994). Most likely, both gene conversion and recombination contribute to merohomogenization, but their relative importance cannot be determined from the present material. Physical proximity is expected to be important but is by itself not sufficient. The CD28/CTLA-4 (CD152) receptors constitute an opposing pair encoded by genes lying close together on mouse chromosome 1 and on rat chromosome 9. Their LBD, a single IgSF V domain, shows ~93% sequence identity between the orthologs, but only ~30% identity between the paralogs in the rat and the mouse. They are thus an example of an opposing pair where the LBDs have not been homogenized. Previously, most multigene families were thought to be subject to concerted evolution (reviewed in Nei and Rooney 2005). If homogenization represents the default development, resistance to it would be the feature requiring explanation. To the extent there are no differential constraints on the partners' LBDs, merohomogenization would then emerge as a natural consequence of the need to preserve only the opposing signaling functions. In the case of CD28/CTLA-4, CTLA-4 binds with higher affinity than CD28 to their shared ligands (Peach et al. 1994; van der Merwe et al. 1997). Provided that it is functionally important to preserve this difference, this pair may have resisted homogenization of their LBD through purifying selection.

This does not explain why the exons encoding the external membrane–proximal stalk have not been homogenized, indicating that the selective homogenization of the LBD needs elucidation. A striking difference between CD28/CTLA-4 versus PIR and Ly49 is that the motifs mediating ligand binding for the former pair are evolutionarily highly conserved, whereas the ligands for PIR and Ly49 are fast-evolving MHC-class I molecules (Nakamura et al. 2004; Naper et al. 2005). Despite >300 million years (Myr) of separate evolution, a chicken homologue of the CD28/CTLA-4 ligands CD80 and CD86 binds to mammalian CTLA-4 (O'Regan et al. 1999). In contrast, although the human CD94/NKG2 ligand HLA-E is unusually well conserved for an MHC class I molecule (Knapp et al. 1998), the interaction surfaces on CD94/NKG2 and its HLA-E ortholog ligands have diverged sufficiently during the ~80 Myr separating primates and rodents to prevent cross-species binding (Miller et al. 2003; Springer et al. 2003).

Apart from CD28/CTLA-4, it is not clear why the partners of opposing pairs should recognize similar or shared ligands. That they do has been shown for the KLRC and PIR families (Kaiser et al. 2005; Nakamura et al. 2004) and rat Ly49 pairs (Naper et al. 2005). The “inhibitory receptors first” hypothesis (Arase and Lanier 2002) was referred to above. The converse hypothesis postulates primary roles for activating receptors in combating infections, with secondary generation of inhibitory partners to prevent autoimmunity. A third possibility is that the recognition of the same ligand by opposing receptors permits more accurate ligand density estimations than either receptor type could do alone. Whatever the reason, for continuous sharing of rapidly evolving ligands, the receptor pairs face a problem of conflicting selection pressures: at one end, the partners must evolve in step when chasing the ligand(s), while at the other end their opposite signaling functions must be conserved. In this context, merohomogenization emerges as a solution. A case in point is the KLRB receptors where we recently showed that the NKR-P1G/–F and the NKR-P1A/–B receptors represent two different subsets recognizing different Clr ligands (Kveberg et al. 2011). Within both subsets, the two opposing receptors share ligands. Notably, whereas the rat and mouse NKR-P1G/–F pairs demonstrate cross-species conservation of specificity for the Clr ligands, the rat NKR-P1A/–B pair did not, indicating that the binding site(s) recognized by the former receptor pair has been preserved between the two species, but not the site(s) recognized by the latter pair. We suggest that the dissimilarity in receptor binding site preservation explains the difference with respect to LBD homogenization between the two pairs. We should add that their recognition of different ligands also explains why homogenization between the two subsets has been resisted.

As for timescale of events, we are currently limited to the study of gene homogenization based on species comparisons, permitting only crude estimates of occurrence in the population and rates of fixation. Following speciation, the paralogs would be expected to drift apart at roughly the same rate as the orthologs, until homogenization events occur. That the paralog sequences do differ from each other provides evidence that the latter events occur relatively rarely compared with single base substitutions (note that the alignments in Fig. 2 underrate sequence differences as only informative substitution sites are shown; see figure legend). At the level of the whole genome, the human and the chimpanzee exhibit ~1.4% single nucleotide substitution differences, corresponding to ca. 1 substitution per 100 nucleotides per 4 million years (Watanabe et al. 2004). Due to the requirement that the differences between paralogs must be significantly lower than between the orthologs, post-speciation homogenization events involving only short stretches in species that have split recently will be concealed. On this background, the shifting alliances of the three exons encoding the LBD of hsNKG2C are noteworthy (Figs. 2 and 3). As this shifting pattern is not exhibited by ptNKG2C, it is most likely due to a gene conversion event involving exon 5, but not the two flanking exons and with hsNKG2E as donor. Evidence for post-speciation homogenization of the human gene was substantiated by genomic alignment, where a large part of intron 4 and exon 5 of hsNKG2C exhibited 100% sequence identity with hsNKG2E, but only ~92% identity with ptNKG2C. In the human/chimpanzee comparisons (divergence time ~6 Myr), we have thus provided evidence of incipient post-speciation homogenization of the KLRC LBD and in the rat/mouse and human/rhesus monkey comparisons (divergence times ~16–23 Myr; Adkins et al. 2001; Springer et al. 2003) post-speciation homogenization of the entire LBD.
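The arithmetic behind these timescale estimates can be made explicit with a short sketch. It assumes, as a simplification, that the ~1.4% human–chimpanzee difference accumulated linearly over the ~6 Myr divergence time quoted above, with no correction for multiple substitutions at the same site. The function names and the 300-nt stretch length are hypothetical choices for illustration.

```python
def myr_per_percent(divergence_percent: float, divergence_time_myr: float) -> float:
    """Time (in Myr) needed to accumulate 1 difference per 100 nucleotides,
    assuming differences accumulate linearly with time."""
    return divergence_time_myr / divergence_percent


def expected_differences(length_nt: int, divergence_percent: float) -> float:
    """Expected number of differing sites in a stretch of the given length."""
    return length_nt * divergence_percent / 100.0


# Values taken from the text: ~1.4% difference, ~6 Myr divergence time.
print(round(myr_per_percent(1.4, 6.0), 1))        # Myr per 1% difference

# Hypothetical 300-nt stretch: how many ortholog differences are expected?
print(round(expected_differences(300, 1.4), 1))
```

With only a handful of expected ortholog differences in a stretch of a few hundred nucleotides, a conversion tract of that size has very few sites at which paralog identity can be distinguished from ortholog identity, which illustrates why short, recent events in closely related species are hard to detect.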

The timescales may be interpreted against the background that, following homogenization events, selection for renewed homogenization is only expected when sufficient mutations have occurred so that the ligand is no longer seen by both partners. Non-rearranging immunoreceptors will inevitably lag behind rapidly evolving ligands, and more so if it is required that the ligands are seen by both partners of an opposing pair. Although merohomogenization would shorten this additional time lag, we would expect cases where the partners have not yet caught up with each other. The inevitable time lags may be one reason why this recognition system exhibits such a profusion of apparently redundant receptors. It is furthermore striking that, of the CLSF receptors we have described here, only the KLRC family, which recognizes a relatively slowly evolving MHC class I ligand, still exists as opposing pairs in primates. Perhaps the inevitable time lags have made these receptor systems so vulnerable to extinction that the other families have only survived in rodents, where the generation time is measured in weeks rather than years.

The somatic recombination driven by RAG1 and RAG2 (recombination-activating genes) focuses directly on the antigen receptors of the adaptive immune system in jawed vertebrates and represents the ultimate sophistication in generating receptor repertoires. This mechanism permits the anticipation of antigens not previously encountered in evolution. In birds and rabbits, gene conversion contributes to the process, catalyzed by the enzyme activation-induced cytidine deaminase (AICD). AICD-related enzymes have also been implicated in the recombination process whereby jawless vertebrates generate variable lymphocyte receptors (reviewed in Litman et al. 2010). The homogenization systems described here, based on germ-line recombination and gene conversion events, are far simpler but may have represented primitive stepping stones towards the complex systems for antigen receptor generation in the adaptive immune system. Whether there are mechanisms favoring the recombination or gene conversion events behind merohomogenization is unknown, but it may be more than a coincidence that the AICD gene is situated close to the NKC (and APLEC), a position conserved between rodents and humans.

In conclusion, we have here demonstrated an unusual form of concerted evolution exhibited by pairs of opposing regulatory leukocyte receptors, with homogenization primarily restricted to the exons encoding the ligand binding domains. Its widespread occurrence indicates that this mechanism plays an important adaptive role.

Methods

Cloning and sequencing of rat NKG2E and rat and mouse NKR-P1G cDNA

Rat NKG2E and rat and mouse NKR-P1G cDNA were cloned before the availability of the rat genome sequence assembly. They were identified as previously described (Flornes et al. 2010; Naper et al. 2002) by

  1. Searching the GenBank rat Trace archive and the EST database for sequences homologous to the published mouse and human receptors, using the NCBI BLAST program.

  2. Performing pairwise BLAST on recently released partially or fully sequenced rat BAC clones. Gene-specific (nested) primers in the 5′- and 3′-UTRs were generated from the predicted sequences.

Full-length transcripts for rat NKG2E were obtained by RT-PCR using mRNA (Dynabeads mRNA Direct kit, Dynal, Oslo, Norway) obtained from IL-2-stimulated NK cells isolated from PVG rat bone marrow and for mouse and rat NKR-P1G from NK cells isolated from C57BL/6 spleen and DA bone marrow, respectively. Three independent clones were fully sequenced on both strands. GenBank accession numbers: Rat NKG2E—AY197774; mouse NKR-P1G—DQ113419; rat NKR-P1G—DQ113420 (the latter two sequences originally deposited as NKR-P1E, but nomenclature was later adjusted as detailed in Kveberg et al. 2011).

Bioinformatics

Sequence similarity searches were performed with BLAST programs, run on the NCBI or Ensembl websites. The phylograms were constructed with NJplot (Perriere and Gouy 1996) based on alignments generated by the ClustalX (Thompson et al. 1997) or the GCG Pileup programs (Accelrys Inc., San Diego, CA, USA).
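Distance-based phylograms start from pairwise distances between aligned sequences. As a rough illustration of that first step only, and not of the ClustalX or NJplot implementations, the following Python sketch computes uncorrected p-distances from a toy alignment and reports each sequence's nearest neighbor. The sequence names and sequences are invented, and the choice of an uncorrected distance that ignores gapped columns is an assumption made for simplicity.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites among columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances, keyed by alphabetically ordered name pairs."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


def nearest_neighbors(matrix: Dict[Tuple[str, str], float]) -> Dict[str, str]:
    """For each sequence, the name of the sequence at the smallest distance."""
    best: Dict[str, Tuple[float, str]] = {}
    for (n1, n2), d in matrix.items():
        for me, other in ((n1, n2), (n2, n1)):
            if me not in best or d < best[me][0]:
                best[me] = (d, other)
    return {name: other for name, (d, other) in best.items()}


# Toy alignment with invented names and sequences.
toy_alignment = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTTC",
    "seqC": "AC-TACGAAC",
}
toy_matrix = distance_matrix(toy_alignment)
toy_neighbors = nearest_neighbors(toy_matrix)
print(toy_matrix)
print(toy_neighbors)
```

A paralog pair that has undergone homogenization would be expected to appear as mutual nearest neighbors within a species for the homogenized exons, while grouping with their respective orthologs for the remaining exons.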