Introduction

One of the factors that determines whether a successful T cell response will be generated upon first encounter with an antigen is the availability of T cell receptors (TR) that interact with the target antigen. The potential T cell repertoire is determined by the number of V, D, and J genes that are available to take part in the process of rearrangement and by the extent to which coding region ends are shortened and nontemplate-encoded nucleotides are added. Successfully rearranged α and β chains form αβ TR, and γ and δ chains are used by γδ TR.

Ruminants, chickens, and pigs have a high percentage of circulating γδ T cells, and these γδ T cells are known to be structurally and functionally more diverse than γδ T cells in mice and humans (Hein and Dudler 1993, 1997). Many authors suggest that the diversity and relative importance of αβ and γδ T cells are reversed in these so-called γδ high species and thus expect limited diversity of the TR α and β chains (Su et al. 1999). In fact, the number of V genes in the chicken TRA/TRD and TRB loci is limited and each locus contains only two subgroups of V genes (Gobel et al. 1994; Kubota et al. 1999; Tjoelker et al. 1990). Less is known about the diversity of ruminant αβ T cells. A survey of TR β chain transcripts does not suggest a limited TRBV gene repertoire in cattle; instead, multiple subgroups and multiple genes within subgroups were identified (Houston et al. 2005; Tanaka et al. 1990).

The six complementarity-determining regions (CDR) of the TR are the most variable parts of the TR and interact directly with the antigen-presenting element/antigen complex. The CDR1 and CDR2 are directly encoded by the V genes (germline encoded), while the CDR3 is encoded by the V–D–J junction, which is formed during the process of rearrangement. In humans and mice, a comparable level of junctional diversity is present, and the number of V genes directly encoding the CDR1 and CDR2 is in the same range. The human TRA/TRD locus lies on chromosome 14, spans 0.9 Mb from the first TRAV to TRAC, and contains 49 TRAV genes, five TRAV/DV genes, and three TRDV genes, giving a total of 57 V genes, of which 49 are functional (Lefranc and Lefranc 2001; IMGT/GENE-DB, Giudicelli et al. 2005; IMGT Repertoire, http://www.imgt.org/textes/IMGTrepertoire/LocusGenes/tabgenes/human/geneNumber.html). The mouse TRA/TRD locus on chromosome 14 spans 1.6 Mb from the first TRAV to TRAC and contains 104 V genes, of which 78–89 are functional (Bosc and Lefranc 2003; Giudicelli et al. 2005).

The bovine T cell receptor β (TRB) and T cell receptor γ (TRG) loci have been described and are located on chromosome 4. The two TRG loci are located at 4q3.1 and 4q1.5–2.2 (Conrad et al. 2007), the TRB locus at 4q24 (Antonacci et al. 2001; Conrad et al. 2002), and the bovine TRA/TRD locus lies on chromosome 10 (Fries et al. 2001; Van Rhijn et al. 2007). The structure of the most downstream part of the TRA/TRD locus, containing TRDC and TRDV4, has been described in detail (Herzig et al. 2006), but the size of the locus and the organization and number of its V genes are unknown. Automated gene prediction methods resulted in 71 functional TRAV/DV genes and 51 TRDV1 genes in the Btau4.0 assembly of the bovine genome (Elsik et al. 2009).

As in sheep, multiple bovine TRDV genes, belonging to four TRDV subgroups, have been described (Herzig et al. 2006; Ishiguro et al. 1993; Van Rhijn et al. 2007). The artiodactyl TRDV1 subgroup is highly expanded compared to humans and mice (Antonacci et al. 2005). Hein and Dudler (1997) identified Vd1.1 to Vd1.26, and Van Rhijn et al. (2007) identified Vd1.27 to Vd1.37. In addition to these 37 TRDV1 genes (identified as rearranged cDNAs), two TRDV2, two TRDV3, and one TRDV4 gene have been described (Herzig et al. 2006; Van Rhijn et al. 2007). Bovine TRAV genes have been described by Ishiguro et al. (1990), but no information on the number of genes and subgroups is available so far.

Using the Btau4.0 assembly of the bovine genome, we set out to describe the V genes of the bovine TRA/TRD locus and found that it contains a fourfold to fivefold higher number of V genes than humans and mice and contains V genes with extended CDR1 and very short or absent CDR2.

Materials and methods

Databases and searches

In order to find bovine TRAV genes, an initial series of BLAST-Like Alignment Tool (BLAT) searches was performed in the bovine genome (Ensembl, Btau4.0 assembly, version 52), using the transcripts of all human TRAV genes. Partially or fully overlapping hits were joined, hits in the TRB and TRG loci were excluded, and a preliminary phylogenetic tree was generated from the resulting set of V genes. One representative bovine V gene from each of 17 branches of this tree was used to perform a series of Basic Local Alignment Search Tool (BLAST) searches in the bovine genome. The published TRDV1, TRDV2, TRDV3, and TRDV4 sequences were also used to perform BLAST searches. After the removal of genes with overlapping genomic locations, the V exons of the bovine V genes identified in this way were downloaded up to the second cysteine (2nd-CYS 104, definition available at IMGT®, http://www.imgt.org) and translated in silico to check for frameshift mutations or internal stop codons in the V exon. The nucleotide sequences of the bovine V exons were arranged into subgroups with 75–100% sequence identity. To check for V gene expression, expressed sequence tags and other cDNAs of TR α and TR δ chains were identified by performing BLAST searches with the constant region (TRAC and TRDC) of the TR α and TR δ chains or with individual V genes.
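The joining of overlapping hits and the in silico translation check described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used in this study: the hit tuples, the function names, and the example sequences are hypothetical, the standard genetic code is assumed, and only internal stop codons are detected (recognizing a frameshift would additionally require comparison with a reference V gene).

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def merge_hits(hits):
    """Join (partially) overlapping hits given as (chromosome, start, end)."""
    merged = []
    for chrom, start, end in sorted(hits):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous hit on the same chromosome: extend it.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def translate(seq):
    """Translate in frame 1; unknown codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def has_internal_stop(v_exon):
    """True if the V exon (up to 2nd-CYS 104) contains a stop codon."""
    return "*" in translate(v_exon)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    hits = [("chr10", 100, 200), ("chr10", 150, 300), ("chr10", 400, 500)]
    print(merge_hits(hits))
    print(translate("ATGTGGTGT"), has_internal_stop("ATGTAATGT"))
```

Sorting the hits first means each hit only has to be compared with the most recently merged interval, which keeps the joining step simple.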

Software

Alignments were performed with ClustalW available at http://www.ebi.ac.uk/Tools/clustalw2. The circular phylogenetic tree in Fig. 2 was based on a ClustalW-generated alignment using iTOL (Letunic and Bork 2007) available at http://itol.embl.de. Translations were performed using the ExPASy translate tool (http://www.expasy.ch/tools/dna.html). Subgroup classification was determined using IMGT/V-QUEST (Brochet et al. 2008) available at http://www.imgt.org. Amino acid alignments were made in accordance with the standardized IMGT alignment scheme for human V genes (IMGT/DomainDisplay tool available at http://www.imgt.org/3Dstructure-DB/cgi/DomainDisplay.cgi) using the IMGT/DomainGapAlign tool (http://imgt3d.igh.cnrs.fr/3Dstructure-DB//cgi/DomainGapAlign-include.cgi). The IMGT/Collier-de-Perles tool (http://www.imgt.org/3Dstructure-DB/cgi/Collier-de-Perles.cgi) was used to create graphical representations of selected V regions (Kaas et al. 2007; Ruiz and Lefranc 2002).

Results

Numbers of genes and size of locus

Initial BLAT searches in the bovine genome using human TRAV sequences resulted in 217 bovine V genes. Subsequent BLAST searches with representative bovine V gene sequences among these 217 genes and with all known bovine TRDV genes resulted in a total of 402 bovine V genes. Some, but not all, previously described genes were 100% identical to a gene on this list. All novel genes were numbered 1–388 (Supplementary Table 1 of the Electronic supplementary material). The total number of previously described genes plus these novel TRAV and TRDV genes is 430. The fact that some previously described genes were not found in the genome in a 100% identical form is most likely due to polymorphisms. Most bovine V genes were found on chromosome 10, but some were located on contigs that had not yet been assigned to a chromosome (Fig. 1), and two were found on chromosome 21 (not shown). Even though these latter two genes did not contain internal stop codons or frameshift mutations that would qualify them as pseudogenes, their location outside the TRA/TRD locus qualifies them as orphons. There are no homologs of TRDC genes on chromosome 21, so these two genes cannot be used in a functional TR α or δ chain. A comparable situation has been described for human TRBV genes on chromosome 9p (Robinson et al. 1993).

Fig. 1

The bovine TRA/TRD locus. Map of the TRA/TRD locus on chromosome 10 and the three biggest contigs that have not yet been assigned to a chromosome. A complete list of the V genes and their exact locations is provided in Supplementary Table 1 of the Electronic supplementary material. Gaps with a size >45,000 bp are shown in gray. Red V genes, pink D genes, light blue J genes, dark blue METTL3 gene

Because the exact linear organization of chromosome 10 is not yet known, the provisional numbering of the genes does not reflect their order on chromosome 10. At the upstream end of the TRA/TRD locus on chromosome 10 lie the methyltransferase-like 3 (METTL3; Fig. 1), zinc finger protein (SALL2; not shown), and olfactory receptor (OR) loci, in conserved synteny with human and mouse. The total size of the bovine TRA/TRD locus from the first V gene to TRAC is 2.4 Mb.

Subgroups and homology to human genes

The nucleotide sequences of the novel and the previously described bovine V genes were used to generate a phylogenetic tree and were arranged in subgroups of >75% nucleotide identity (Fig. 2; Tables 1 and 2). For comparison, the human TRAV and TRDV were included. Of the 41 human TRAV subgroups, 11 are not represented in the bovine genome. Thirty human TRAV subgroups have bovine members and can thus be classified as interspecies subgroups. We found 11 bovine subgroups that are not represented in the human genome. As shown previously by others, the bovine and human TRDV1 subgroups are homologous to each other, as well as the bovine TRDV4 and human TRDV3 subgroups (Herzig et al. 2006; Su et al. 1999).
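As an illustration of how V genes can be arranged into subgroups on the basis of a nucleotide identity threshold, the following Python sketch computes pairwise percent identity from an existing multiple alignment and groups the sequences by single linkage. Several points are assumptions made for the example only: the grouping rule (single linkage), the treatment of gaps (columns gapped in both sequences are skipped, a gap against a base counts as a mismatch), the function names, and the toy sequences. The subgroups reported in this study were derived from the phylogenetic tree and IMGT/V-QUEST, not from this code.

```python
def percent_identity(a, b):
    """Percent identity between two rows of the same alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column is a gap in both sequences: ignore it
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def subgroups(aligned, threshold=75.0):
    """Single-linkage grouping of aligned sequences at the given identity."""
    names = list(aligned)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if percent_identity(aligned[first], aligned[second]) >= threshold:
                parent[find(first)] = find(second)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Toy sequences, not real V genes.
    toy = {
        "v1": "ACGTACGTACGT",
        "v2": "ACGTACGTACGA",
        "v3": "TTTTTTTTTTTT",
    }
    print(subgroups(toy))
```

With single linkage, a gene joins a subgroup as soon as it reaches the threshold with any one member, so two members of the same group may share less than 75% identity with each other; a stricter rule such as complete linkage would avoid this at the cost of producing more groups.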

Fig. 2

Phylogenetic tree of all bovine V genes of the TRA/TRD locus. Tree of all bovine TRAV and TRDV genes and one representative human TRAV or TRDV gene of each human subgroup. The tree is based on the nucleotide sequences of the V-EXON up to the second cysteine (2nd-CYS 104). All novel genes are numbered 1–391 and all previously described genes are shown under their previously published name. The V genes in each color-coded segment of the circle belong to one subgroup

Table 1 Bovine and human V gene subgroup assignments
Table 2 Summary and statistics of the bovine and human TRA/TRD loci

In cattle, in contrast to the highly expanded TRDV1 subgroup, the other TRDV subgroups contain very few genes: only two TRDV2, two TRDV3 (including one incomplete), and one TRDV4 gene have been described in the past. We found one additional TRDV2 gene (gene 221) and no additional TRDV3 and TRDV4 genes. The total number of bovine TRDV genes is 111. Most likely, the other 319 V genes are TRAV or TRAV/DV. The existence of bovine V genes that are used in both α and δ chains (TRAV/DV genes) has been shown previously using a polymerase chain reaction-based approach (Herzig et al. 2006).

Description of protein sequences and individual genes: some V genes lack a CDR2

Upon in silico translation of all bovine V genes, 86 were determined to be pseudogenes based on frameshift mutations or internal stop codons in the V-EXON of the V genes, and 336 V genes that did not contain such mutations were assessed as full-length functional genes. Of eight genes, the full-length coding sequence was not available. It is possible that the number of pseudogenes is slightly underestimated because some additional V genes may have mutations in the leader exon (L-PART1), encoding part 1 of the leader. The predicted amino acid sequences of the novel bovine V genes that are not pseudogenes were aligned. One representative bovine V gene of each subgroup is shown in Fig. 3a, upper part. Some individual V genes are aberrant in the sense that they have a mutated conserved first or second cysteine, a mutated conserved tryptophan, or considerable insertions or deletions compared with their subgroup members. These genes are shown separately (Fig. 3a, lower part).
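The criteria used above to flag atypical V genes (a changed 1st-CYS 23, Trp 41, or 2nd-CYS 104) can be expressed as a short check on an IMGT-gapped amino acid sequence. The sketch below is hypothetical and rests on a simplifying assumption: the input string is gapped so that string index i corresponds to IMGT position i + 1, which does not hold for genes whose CDR1 exceeds the usual 12 positions. The function name and the toy sequences are illustrative and are not taken from the data set.

```python
# Conserved positions of the V-DOMAIN in the IMGT unique numbering.
CONSERVED = {23: "C", 41: "W", 104: "C"}


def conserved_residue_changes(gapped):
    """Return {IMGT position: residue found} for each changed conserved residue.

    Assumes one character per IMGT position (position 1 at index 0);
    positions beyond the end of the sequence are reported as '.'.
    """
    changes = {}
    for position, expected in CONSERVED.items():
        found = gapped[position - 1].upper() if len(gapped) >= position else "."
        if found != expected:
            changes[position] = found
    return changes


if __name__ == "__main__":
    # Toy sequences of 104 positions, not real V genes.
    residues = ["A"] * 104
    residues[22], residues[40], residues[103] = "C", "W", "C"
    typical = "".join(residues)
    residues[40] = "L"  # conserved Trp 41 replaced by Leu
    atypical = "".join(residues)
    print(conserved_residue_changes(typical))
    print(conserved_residue_changes(atypical))
```

An empty result means that all three conserved residues are present; insertions and deletions relative to subgroup members would still have to be assessed from the alignment itself.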

Fig. 3

Alignment of amino acid sequences of bovine V genes. a Alignment of predicted bovine TRAV and TRDV protein sequences. One representative of each subgroup is included in the alignment (upper part, labeled “Representatives”). In addition, some V genes with special features are shown (lower part, labeled “Atypical genes”). The CDR1 and CDR2 amino acids are colored as follows: gray positively charged R groups, light blue negatively charged R groups, light green aromatic R groups, pink polar uncharged R groups, yellow nonpolar aliphatic R groups, red conserved cysteines (1st-CYS 23 and 2nd-CYS 104) and conserved tryptophan 41. a Deletion in FR1 and CDR1 compared to the human homolog. b Deletion in FR2 compared to the human homolog. c Conserved Trp 41 is a Leu. d Conserved Cys 104 is a Val. e Conserved Cys 104 is a Trp. f Conserved Cys 23 is a Ser. g Deletion in FR2 and CDR2 compared to the human homolog. h Conserved Cys 104 is a Tyr. i Conserved Trp 41 is a Ser. j Conserved Cys 23 is a Phe. k Deletion in FR3 compared to the other genes of the subgroup. l Deletion in the FR2, CDR2 and FR3 compared to the human homolog. m Insertion of six amino acids in CDR1 and deletion in the FR2, CDR2, and FR3 compared to its human homolog. n Insertion of four amino acids in CDR1 and conserved Cys104 is a Val. b IMGT Collier-de-Perles of two atypical genes were created to compare their 2D structure with the 2D structure of a standard gene. Conserved amino acids (1st-CYS 23, Trp 41, hydrophobic amino acid 89, 2nd-CYS 104) always have the same position, based on the IMGT unique numbering for V-DOMAIN (Lefranc et al. 2003) and are marked red. CDR1 is shown in dark blue and the CDR2 in orange

The impact of the insertions or deletions of two particular V genes was studied by creating an “IMGT Collier-de-Perles” representation (Fig. 3b), illustrating that gene 330 and gene 258 have a nine-amino-acid deletion leading to the loss of the complete CDR2 and part of FR3. This nine-amino-acid deletion is present in a total of four V genes that are all part of the bovine TRDV1 subgroup. Gene 330 also has an extremely long CDR1 (18 amino acids), which is considerably longer than the limit of 12 amino acids set by IMGT for the usual CDR1. Gene 330 thus combines an extremely long CDR1 with the absence of a CDR2.

Because the nine-amino-acid CDR2 and partial FR3 deletion was found in four V genes, we were interested to see whether these genes are functionally rearranged and used by T cells. The databases contained two mRNA sequences (accession numbers BC142414 and EF175173) that each consisted of one of the V genes with a nine-amino-acid deletion in the CDR2/FR3, functionally rearranged to different TRDD and TRDJ genes and spliced to TRDC, suggesting that these genes are functional.

The bovine homologs of the V genes used by the invariant TR α chain of mucosal-associated invariant T (MAIT) cells (human TRAV1-2, mouse TRAV1) and NKT cells (human TRAV10, mouse TRAV11, or TRAV11D) have been previously identified and were confirmed in the current study to be the closest possible bovine homologs of the human TRAV1-2 and TRAV10, respectively (and similarly of the mouse TRAV1 and TRAV11 or TRAV11D, respectively). Bovine gene 180 is the only bovine gene of the interspecies subgroup to which human TRAV1-2 and mouse TRAV1 belong, the V gene used by MAIT TR (Ishiguro et al. 1990; Tilloy et al. 1999). Interestingly, the V gene used by MAIT cells is the first one in the locus (in mice and cattle) or the second one (in human) and is interspersed between olfactory receptor genes (Glusman et al. 2001; Parra et al. 2008). Among the four bovine genes that form a subgroup with human TRAV10, three had already been identified as candidate V genes for hypothetical bovine NKT TR α chains (Looringh van Beeck et al. 2009).

Discussion

It has been previously shown that the number of TRDV genes and the potential and actual variability of γδ TR in the artiodactyls sheep, cattle, and pigs are much higher than in other species (Antonacci et al. 2005; Hein and Dudler 1993; Van Rhijn et al. 2007; Yang et al. 1995), and it has been suggested that this may relate to the fact that they are “γδ high” species. In this study, we show that the TRAV genes in cattle are also much more plentiful than in mice and humans, and the numbers of genes identified by our method of manual annotation greatly exceed a previous prediction based on automated gene annotation using the same assembly of the genome (Elsik et al. 2009). In addition to the high number of V genes described in this study, an excess of heterozygosity in the bovine TRA/TRD locus has been demonstrated (Fries et al. 2001). Although the actual variability of αβ TR in artiodactyls remains to be determined, the existence of such a high number of bovine V genes raises the question of whether this implies an extended functionality and what evolutionary forces may have shaped this diversity.

From the available crystals of murine and human classical major histocompatibility complex (MHC) proteins with bound peptides and the αβ TR recognizing these complexes (Kaas et al. 2004; Rudolph et al. 2006), it is known that CDR1 and CDR2 mainly interact with the surface of the MHC protein, whereas CDR3 interacts with the peptide. Even though there is some variation in docking angle, interaction of all six CDR with the MHC–peptide complex is possible because the αβ TR docks approximately in a straight line on top of the MHC–peptide complex and the CDR have approximately the same length. This docking mechanism of TR on classical MHC–peptide complexes is highly similar in humans and mice and supported by a large set of data (Rudolph et al. 2006). However, for nonclassical antigen-presenting elements and/or γδ TR, extrapolations to other species are difficult because there is only a limited amount of data available and some nonclassical antigen-presenting elements and T cell populations are not distributed among all species. A few cases of direct recognition of a target molecule by an individual γδ TR have been described and include the murine nonclassical MHC proteins T10 and T22, which are absent in humans; the human MHC-I-like CD1c, which is absent in mice; and allo-MHC (Bluestone et al. 1988; Ito et al. 1990; Schild et al. 1994; Spada et al. 2000). The murine γδ TR G8, recognizing the T10 and T22 proteins (Adams et al. 2005), has been crystallized and shows that the very long CDR3 of the δ chain is responsible for most of the contact between the molecules. Because of the unequal length of the CDR loops, the TR interacts at an angle with its target molecule. No crystallographic data on bovine TR or antigen-presenting elements are available. However, based on a comparison with the known mode of interaction of human and murine TR with a classical MHC–peptide complex, the four CDR2-less bovine V genes are unlikely to recognize classical MHC–peptide complexes.