A clonotype nomenclature for T cell receptors


T cell receptor (TCR) nucleotide sequences are often generated during analyses of T cell responses to pathogens or autoantigens. The most important region of the TCR is the third complementarity-determining region (CDR3) whose nucleotide sequence is unique to each T cell clone. The CDR3 interacts with the peptide and thus is important for recognizing pathogen or autoantigen epitopes. While conventions exist for identifying the various TCR chains, there is a lack of a concise nomenclature that would identify both the amino acid translation and nucleotide sequence of the CDR3. This deficiency makes the comparison of published TCR genetic and proteomic information difficult. To enhance information sharing among different databases and to facilitate computational assessment of clonotypic T cell repertoires, we propose a clonotype nomenclature. The rules for generating a clonotype identifier are simple and easy to follow, and have a built-in error-checking system. The identifier includes the V and J region, the CDR3 length as well as its human or mouse origin. The framework of this naming system could also be expanded to the B cell receptor.


A hallmark of immunity is the intrinsic ability to recognize and eliminate foreign molecules, cells, and organisms. The adaptive immune system is composed of B and T cells. During their development, these cells express unique heterodimeric receptors that can be used in pathogen recognition. Each of these receptor chains is generated by a somatic rearrangement process that joins different segments of the TCR and BCR genes and creates a novel gene. This joining process is imprecise, with insertion of non-templated nucleotides (N nucleotides) at the junction site, as well as 3′- and 5′-nucleotide deletion from the germline genes participating in the rearrangement. This region of random nucleotide insertion or deletion is referred to as the third complementarity-determining region (CDR3). The resulting CDR3 has a unique nucleotide sequence that is specific to that particular B or T cell and all its progeny; hence, the clonotypic nature of the receptors. The CDR3 is the portion of these receptors that is most involved in interactions with intact soluble antigens (B cells) or intracellular processed antigens presented as immunogenic peptides loaded in MHC molecules (T cells).

The initial phase of the adaptive immune response involves B and T cell clonal selection on the basis of the structural complementarity of antigen-specific receptors to pathogen-derived epitopes (Davis and Chien 2003; Kolar and Capra 2003). The cells recruited into the immune response execute their effector functions. After pathogen clearance, a proportion of these cells will be retained as memory. Memory provides more rapid and effective immune protection against recurring pathogens present in the environment. The collection of cells that respond to a particular pathogen is referred to as the repertoire.

T and B cells can also be implicated in responses to non-pathogenic environmental stimuli (allergies). More serious is the lack of tolerance to self that results in responses to self-antigens giving rise to autoimmune disease. In each case, a repertoire of allergen- or self-specific B or T cells is generated.

The repertoire recognizing a molecule would be a sum of the repertoires responding against all the component epitopes of the molecule. The repertoire against an organism would be the sum of all the repertoires against all the molecules from the pathogen.

Measuring an immune response at the level of the repertoire is becoming very common (Correia-Neves et al. 2001; La Gruta et al. 2008; Naumov et al. 1998; Pewe et al. 2004; Probert et al. 2007; Venturi et al. 2008). An antigen-specific response can be viewed in the context of how many T cells are recruited and the structure of their antigen receptors. The nature of the naïve and antigen-experienced cell repertoires is of interest in basic and clinical immunology, immune-pharmaceutics, and vaccine development. However, comparison of datasets from similar, or even identical, experiments from different laboratories is cumbersome because there is no unified clonal identification procedure in which the clonotypic antigen receptor serves as a marker of clonal identity. Having a quick way to assign a specific identifier to each receptor sequence would facilitate such comparison studies.

There are two subsets of T cells based on the exact pair of receptor chains expressed. These are either the alpha (α) and beta (β) chain pair, or the gamma (γ) and delta (δ) chain pair, identifying the αβ or γδ T cell subsets, respectively. The expression of the β and δ chain is limited to one chain in each of their respective subsets and this is referred to as allelic exclusion (Bluthmann et al. 1988; Uematsu et al. 1988). These two chains are also characterized by the use of an additional DNA segment, referred to as the diversity (D) region, during the rearrangement process. The D region is flanked by N nucleotides; together these constitute the NDN region of the CDR3 in these two chains.

The CDR3 of each of the two receptor chains defines the clonal specificity. For αβ T cells, the CDR3 makes the most contact with the peptide bound to the MHC (Rudolph et al. 2006). For this reason, CDR3 sequences have been the main focus of sequencing studies. In the past three decades, TCR clone sequences have been presented in publications in many different forms. Some use an alias as an identifier and present the whole nucleotide sequence of a clone, identifying the V, D, and J segments (Elliott et al. 1988). In some publications, the information about the V and J usage and the amino acids of the V/NDN/J junction sequences (Kent et al. 2005) is given, while in other publications, both nucleotide and amino acid sequences of all the different segments that have been recombined to make up the CDR3 region of the TCR clones are given (Maslanka et al. 1996; Naumov et al. 1998; Shin et al. 2005). However, a full sequence can be quite bulky. Often, for simplicity, each sequence is assigned its own alias, which may be a number or a combination of letters and numbers, to ease the tracking of information (Cameron et al. 2002; Chien et al. 1987; Correia-Neves et al. 2001; Davis and Bjorkman 1988; Elliott et al. 1988; Kalams et al. 1994; La Gruta et al. 2008; Lehner et al. 1995; McHeyzer-Williams and Davis 1995; Naumov et al. 1998, 2006; Pewe et al. 2004; Venturi et al. 2008). With the arrival of new ultra-high throughput or massively parallel sequencing techniques, these data sets are bound to grow larger. Without proper standardization, the general compilation of such information across published and documented data sources is problematic. Thus, there is a need for a nomenclature that allows the TCR chains to be properly enumerated and traced to the T cell clones.

The primary purpose of this naming system is to have a unique identifier for the CDR3 of each TCR chain, so that information about the T cell clones in publications, databases, and other forms of communication can be unambiguously associated with the correct T cell clone. The proposed nomenclature is intended to provide the immunology community an easy route to share genetic information about clonal and clonotypic T cell receptors.

Materials and methods

T cell clonotypes

To properly document and enumerate TCR CDR3, we have developed a working definition of a clonotype and a nomenclature that reflects the sequence information of the CDR3 of that particular receptor:

  1. A TCR clonotype is a unique nucleotide sequence that arises during the gene rearrangement process for that receptor. The combination of nucleotide sequences for the surface expressed receptor pair would define the T cell clonotype.

  2. Clonotyping is a process to identify the unique nucleotide CDR3 sequences of a TCR chain. This generally involves PCR amplification of the cDNA using V-region-specific primers and either constant region (C) specific or J-region-specific primer pairs, followed by nucleotide sequencing of the amplicon.

  3. Clonotype nomenclature is the system for assigning identifiers and tracing records of clonotype identification.

The clonotype nomenclature

The clonotype nomenclature refers to a system of names that are fully controlled through explicit and rigid syntactic rules. We have also defined a list of desired features for a clonotype identifier that allows computational assignment. To minimize the identifier length and to maintain clarity a mix of letters and digits is used. There is a firm restriction on the use of capitalization and character formatting in a formal name. For the clonotype nomenclature, lowercase letters are reserved for amino acid sequences in the V and J regions and uppercase letters are reserved for amino acid sequences in the region between the V and J. Thus, the uppercase corresponds to amino acids encoded by the N or NDN regions. To make names fully computable, we are avoiding the use of subscripts, superscripts, accents, and word separators; Greek symbols are replaced by uppercase Roman letters; the period (‘.’) is used as a symbol separator.

The clonotype-naming process

The name contains information on amino acid sequence originating from V, J, and NDN regions. The name consists of five segments: (1) CDR3 amino acid identifier, (2) CDR3 nucleotide sequence identifier, (3) variable (V) segment identifier, (4) joining (J) segment identifier, and (5) CDR3 length identifier. The name can be constructed and deconstructed in the same manner. Access to a standard genetic code table and the germline configuration of the V and J regions identified in the name is all that is needed to reconstruct the actual nucleotide sequence of the clonotype.

The rules for clonotype naming are as follows:

  1. CDR3 amino acid identifier

This segment uses the one-letter code and always starts and ends with a lowercase letter. The starting lowercase letter represents the last amino acid from the V segment which is completely (all three nucleotides) encoded from the V region. The final lowercase letter represents the first amino acid entirely encoded by the J region. Uppercase letters represent amino acids that are encoded fully or in part by the NDN region.

  2. Nucleotide sequence identifier (ID)

A series of digits (the ID), preceded by a period as a symbol separator, is reserved for the nucleotide identifier. Each digit reflects the specific codon used for the corresponding uppercase amino acid in the name. The number of digits is not limited, and the digits appear in the same order as their amino acid counterparts. The identifier assignment is based on the standard codon table (Table 1). The codons for each amino acid are numbered sequentially from top to bottom and then across and down for the six codon amino acids. The three termination codons are assigned “O” and numbered 1 for Ochre (TAA), 2 for Amber (TAG), and 3 for Opal (TGA). The letter “O” is chosen because two of the three terminators start with this letter and no amino acid is associated with this letter.

  3. TCR V region identifier

Table 1 Genetic codes and their assigned ID numbers

The V gene family (also referred to as group) is identified by an uppercase Roman letter followed by a specific subfamily (also referred to as subgroup) identifier. In order to sort the clonotypes based on the V gene usage we assign a fixed number of characters for the V gene subfamilies. The names of the human V gene subfamilies are as originally described by Hood and colleagues (Rowen et al. 1996). Each V gene is assigned a subfamily number (two digits) followed by S and another number to define the subfamily member. In the case of TCR AV and TCR BV genes, some subfamilies have more than one member. The members are identified by S1, S2, S3, … for human and −1, −2, −3, … for mouse. The names of the mouse V genes are based on the ImMunoGeneTics (IMGT) database (Giudicelli et al. 2005). For mouse distal V alpha genes that are repeats of the proximal ones, we omitted the “–” in the name to keep the total characters to five, similar to that for human V genes. The breakdown of the assigned characters is shown in Table 2.

Table 2 Breakdown of assigned characters for human and mouse V genes in clonotype identifier

The identification of a subfamily member from a TCR sequence focused on the CDR3 depends on the specificity of the V region primer and the sequence homology of the subfamily members in the DNA segment 3′ of the V primer. If the primer is specific enough to distinguish a specific subfamily member, the clonotype name will carry that subfamily member’s name. If the primer anneals to a region in which all subfamily members have identical sequences, then the DNA sequence 3′ of the primer will determine the TCRV name. If all subfamily members also have identical sequences in this region, the subfamily member’s name will be SX for human and −X for mouse. If some subfamily members can be defined but others cannot, the indistinguishable members are referred to using Y and Z. The possible members that comprise Y and Z should be stated explicitly.

Some AV genes can rearrange to either alpha J genes (resulting in a TCR alpha chain) or delta J genes (resulting in a TCR delta chain). These are called ADV genes. Based on their location in the AV locus region, we simplify the nomenclature by using the alpha gene name. The J region identifier then specifies to which constant region the VA is linked. Shown in Table 3 are the human and mouse alpha/delta genes and our corresponding nomenclature. There are two genes that do not follow this rule; the human delta V1 gene which is located between the AV23 and AV24 genes only rearranges to the delta J genes and yet has not been found rearranging to alpha J genes, and mouse AV15-2/DV6-2 and AV15D-2/DV6-2 genes are similar and yet have not been found rearranging to the alpha J genes. In our naming system, the delta V name will be used for these genes; D1 for human and D6-2 for mouse.

Table 3 Human and mouse α/δ gene assignment

Allelic forms of V regions exist. Currently, the clonotype nomenclature does not account for these. They could be identified by enlarging the V region identifier by one or two characters. The need for this level of characterization is unclear at this time, so the identifier has been kept shorter for the sake of usability.

  4. TCR J region identifier

The J gene identifier appears after the V gene identifier. The J gene family is expressed by Roman letters as defined for the V gene identifier above. Human (Rowen et al. 1996) and mouse (Giudicelli et al. 2005) J genes are named as described. For the sake of brevity and to facilitate sorting, the “S” that designates subfamily members in human and the “−” that designates subfamily members in mouse are dropped, resulting in a two-digit number. The detail of J character assignment is shown in Table 4. It should be noted that there are five subfamilies in the human gamma J family (1, 2, P, P1, and P2): two are identified by numbers (gamma J1 and gamma J2), one by a letter (gamma JP), and two by a letter and a number (gamma JP1 and gamma JP2). In order to have the same number of characters for all human gamma J genes, we assign a number to those that are not identified by a number alone, as follows: GJP = GJ3, GJP1 = GJ4, and GJP2 = GJ5.

Table 4 Breakdown of assigned characters for human and mouse J genes in clonotype identifier

A number of alleles of J regions have been reported (Lefranc and Lefranc 2001). Currently, the nomenclature does not take these into account. If needed, the J identifier could be extended by one character to include an allele identifier.

  5. CDR3 length identifier

The length of the clonotype is determined by the number of amino acids between the C-terminal-conserved cysteine (C) of the V region and the phenylalanine (F) of the J region, which is part of the FGXGT motif conserved in all J regions. The C and the F are not counted in the length. The number representing the length is preceded by the letter “L”, which serves as a symbol separator. In order to sort the clonotypes based on their length, we assign three characters to the length identifier: the letter “L” followed by two digits specifying the length.
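The codon numbering in rule 2 can be sketched in a few lines of Python. Table 1 is not reproduced in this text, so the ordering below is an assumption: codons are enumerated with each base position running through T, C, A, G, and each amino acid's codons are numbered in the order they are met. This agrees with every codon ID quoted in this paper (for example ATT = I1, CGG = R4, AGT = S5, AGC = S6, ACG = T4, and the three terminators), but IDs for other amino acids, leucine in particular, should be checked against Table 1. The names `BASES`, `AMINO_ACIDS`, and `build_codon_ids` are hypothetical.

```python
# Assumed ordering (not taken from Table 1 itself): T, C, A, G at each position.
BASES = "TCAG"

# Standard genetic code for the 64 codons enumerated in that order; "*" = stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def build_codon_ids():
    """Return {codon: (letter, id)}, with stop codons reported under the letter 'O'."""
    table = {}
    seen = {}  # how many codons have been numbered so far for each letter
    codons = (a + b + c for a in BASES for b in BASES for c in BASES)
    for codon, aa in zip(codons, AMINO_ACIDS):
        letter = "O" if aa == "*" else aa
        seen[letter] = seen.get(letter, 0) + 1
        table[codon] = (letter, seen[letter])
    return table


CODON_IDS = build_codon_ids()

for codon in ("ATT", "CGG", "AGT", "AGC", "TAA", "TAG", "TGA"):
    letter, number = CODON_IDS[codon]
    print(codon, letter + str(number))
```

Because the table is generated rather than typed in by hand, the same function can serve both naming and decoding.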

Results and discussion

Generating a TCR clonotype identifier

The use of the nomenclature is demonstrated for a TCR β-chain clonotype from our studies of CD8 T cells from HLA-A2.1 individuals responding against the influenza A matrix protein M1-derived peptide, M1 58–66 (Fig. 1). The nucleotide sequence is shown in the center with the amino acid translation underneath. The nucleotides corresponding to the NDN region are bold and underlined. The last amino acid that is completely encoded by the V gene (agt) is serine, so the clonotype name starts with the lowercase letter “s”. Amino acids in the NDN region (IRSS) are presented as uppercase letters. It should be noted that I is partially encoded by the V gene and the second S is partially encoded by the J gene. The first amino acid that is completely encoded by the J gene is tyrosine (tac) and is denoted by a lowercase “y”. A period is used as a symbol separator between the amino acid identifier and the nucleotide identifier. Each uppercase letter in the NDN region (IRSS) gets a nucleotide identifier number based on the codon table (Table 1). An Ile that is encoded by ATT gets number 1, Arg that is encoded by CGG gets number 4, Ser that is encoded by AGT gets number 5, and Ser that is encoded by AGC gets number 6. In this case, the four-digit number “1456” identifies the nucleotide sequence of the NDN region (atT, CGG, AGT, AGc). The origin of the V gene is expressed by B for beta, followed by the subfamily 19 and subfamily member S1. The “S” in the V region identifier indicates that the clonotype is derived from a human T cell. The origin of the J gene is shown by B for beta, followed by subfamily 2 and subfamily member 7. The length of the clonotype is the number of amino acids between the cysteine (C) of the V region and the phenylalanine (F) of the J region, which is part of FGXGT; in this case it is 11. Thus, the length identifier is the letter “L” for length followed by the number 11.
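As an illustration only, the assembly of this identifier can be sketched in Python. The function `make_identifier` and its arguments are hypothetical, and the codon numbering is an assumed reconstruction of Table 1 (bases enumerated in T, C, A, G order) that matches the codon IDs quoted in this paper. The V and J identifiers are passed in already formatted, and the zero-padding of the length to two digits is an assumption based on the stated fixed width.

```python
BASES = "TCAG"
# Standard genetic code with codons enumerated in T, C, A, G order; "*" = stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)


def codon_table():
    """Map each codon to (letter, id); assumed reconstruction of Table 1."""
    table, seen = {}, {}
    codons = (a + b + c for a in BASES for b in BASES for c in BASES)
    for codon, aa in zip(codons, AMINO_ACIDS):
        letter = "O" if aa == "*" else aa
        seen[letter] = seen.get(letter, 0) + 1
        table[codon] = (letter, seen[letter])
    return table


def make_identifier(v_aa, ndn_codons, j_aa, v_gene, j_gene, cdr3_aa):
    """Assemble a clonotype identifier from its five segments.

    v_aa / j_aa : last fully V-encoded and first fully J-encoded amino acid
    ndn_codons  : codons of the amino acids encoded fully or partly by NDN
    cdr3_aa     : amino acids from the conserved C through the conserved F
    """
    table = codon_table()
    letters, digits = [], []
    for codon in ndn_codons:
        letter, number = table[codon.upper()]
        letters.append(letter)
        digits.append(str(number))
    length = len(cdr3_aa) - 2  # the conserved C and F are not counted
    return (
        f"{v_aa.lower()}{''.join(letters)}{j_aa.lower()}"
        f".{''.join(digits)}{v_gene}{j_gene}L{length:02d}"
    )


name = make_identifier(
    "s", ["ATT", "CGG", "AGT", "AGC"], "y", "B19S1", "B27", "CASSIRSSYEQYF"
)
print(name)
```

The CDR3 amino acid sequence used here (CASSIRSSYEQYF) is the one reported for this BV19S1/BJ2S7 clonotype; the 13 residues minus the flanking C and F give the length of 11.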

Fig. 1
figure 1

An example of a TCR β-chain clonotype identifier. The BV and the BJ regions are fully identified. The single-letter-code amino acid translation is shown below the nucleotide sequence. The bold uppercase letters represent the conserved amino acids from the V (C) and from the J (FG). The amino acids that are not completely encoded by the germline, which are predominantly encoded by the NDN, are also in uppercase (IRSS). Below the NDN-encoded amino acids is the codon ID for each of them as assigned from Table 1. The bold underlined lowercase letters represent the last amino acid that is completely encoded by the V gene (s) and the first amino acid that is completely encoded by the J region (y). The clonotype identifier takes the uppercase NDN amino acids and flanks them with the lowercase V and J encoded amino acids. This is followed by the codon ID for the uppercase NDN sequence. The V and J chains are next identified. Finally, the length of the CDR3 is determined by counting the number of amino acids between the uppercase C and uppercase FG. This count is shown in the top line.

It should be noted that the example shown here does not account for allelic differences in the V or the J genes. An example of the same clonotype identifier with the J allele information would be sIRSSy.1456B19S1B271L11, with the first two digits of the J identifier specifying the subfamily member (2S7) and the final digit specifying the allele.

V region nomenclature

For TCRAV and TCRBV, some subfamilies have more than one member. The identification of the subfamily members depends on two factors. The first is the specificity of the V region primer that is used for amplifying the particular V subfamily member. Primers could be designed that are specific for only one subfamily member. If the primer is specific enough to anneal only to one of the V subfamily members, then the clonotype identifier will use the subfamily member’s name, such as S1, S2, S3 (for human) and −1, −2, −3 (for mouse). The second factor is the sequence homology between the subfamily members in the region 3′ of the V primer up to the conserved cysteine. Nucleotide differences downstream of the conserved cysteine are not considered because of the possibility of excision during the rearrangement process. For some choices of primer, there may be sufficient differences in the region between the primer and the conserved cysteine that the particular subfamily member can be identified. If this is the case, then the name of the subfamily member is used. In other cases, the sequence between the V primer and the conserved cysteine matches multiple subfamily members. We reserve the letter X for use if the primer does not allow any distinction of subfamily members. Y and Z can be used to designate subsets of possible subfamily members, and these must be defined. These designations will be specific for the primers used and, once defined, can be reused. An example of V gene identification is shown in Supplementary Table 1.

Identifying other chains

Additional examples of using the nomenclature for human α-TCR, β-TCR, γ-TCR, and δ-TCR are shown in Table 5. Since the δ-chain could be the result of either Vδ- or Vα-chain genes rearranging to Jδ, we show an example of the naming of both such possibilities (examples 4 and 5).

Table 5 Examples of human TCR α, β, γ, and δ clonotype identifiers

Decoding TCR clonotype identifier

By decoding the name, the nucleotide sequence of the TCR chain can be derived by reversing the steps used for the encoding. Using the first example shown in Table 5, “rTs.4A38S2A53L13”, the “A38S2” and “A53” show that the clonotype origin is human and that the sequences of the alpha V38S2 and alpha J53 genes are needed for the decoding (Fig. 2). The entire length between the “C” and “FG” is 13 amino acids. So the genomic sequence of the TCR AV38S2 is obtained and the positions of the amino acids are lined up with a length ruler starting with the position immediately after the conserved cysteine. The TCR AJ53 is then placed so that the last amino acid before the conserved FG lines up with the end of the length ruler. The two lowercase letters “r” and “s” in the name identify the last V and the first J position encoded by the germline. This leaves one position to be filled by the N nucleotides, and this is the threonine represented in the clonotype name as “T”. The codon table shows that codon 4 for T is ACG. The T can be encoded entirely by N nucleotides (ACG), or the initial nucleotide, a, could be V-germline-encoded and the rest of the codon N derived. While Fig. 2 uses the second possibility (aCG), the sequence is decoded either way.
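The first part of this decoding, splitting the name into its segments and recovering the NDN codons, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the regular expression covers only the fixed-width human form without allele characters used in the examples (a five-character V identifier such as A38S2, a three-character J identifier, and “L” plus two digits), and the codon numbering is an assumed reconstruction of Table 1. The names `PATTERN` and `decode` are hypothetical. Aligning the germline V and J sequences to the length ruler is not attempted here.

```python
import re

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)


def codon_table():
    """Map each codon to (letter, id); assumed reconstruction of Table 1."""
    table, seen = {}, {}
    codons = (a + b + c for a in BASES for b in BASES for c in BASES)
    for codon, aa in zip(codons, AMINO_ACIDS):
        letter = "O" if aa == "*" else aa
        seen[letter] = seen.get(letter, 0) + 1
        table[codon] = (letter, seen[letter])
    return table


# Human, allele-free layout only (an assumption; mouse names differ).
PATTERN = re.compile(
    r"^(?P<v_aa>[a-z])(?P<ndn>[A-Z]*)(?P<j_aa>[a-z])"
    r"\.(?P<ids>[1-6]*)"
    r"(?P<v_gene>[ABGD]\d{2}S[0-9XYZ])"
    r"(?P<j_gene>[ABGD]\d{2})"
    r"L(?P<length>\d{2})$"
)


def decode(identifier):
    """Split an identifier into its segments and look up the NDN codons."""
    match = PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised identifier: {identifier!r}")
    fields = match.groupdict()
    if len(fields["ndn"]) != len(fields["ids"]):
        raise ValueError("one codon ID is required per uppercase amino acid")
    by_id = {value: codon for codon, value in codon_table().items()}
    codons = []
    for aa, digit in zip(fields["ndn"], fields["ids"]):
        if (aa, int(digit)) not in by_id:
            raise ValueError(f"amino acid {aa} has no codon number {digit}")
        codons.append(by_id[(aa, int(digit))])
    fields["ndn_codons"] = codons
    fields["length"] = int(fields["length"])
    return fields


decoded = decode("rTs.4A38S2A53L13")
print(decoded["v_gene"], decoded["j_gene"], decoded["length"], decoded["ndn_codons"])
```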

Fig. 2
figure 2

Deriving the nucleotide sequence of the CDR3 by decoding the clonotype TCR α-chain identifier. The genomic sequence of the TCRV gene (AV38S2) is obtained and the positions of the amino acids are lined up with a length ruler starting with the position immediately after the conserved cysteine. The TCRJ gene (AJ53) is then placed so that the last amino acid before the conserved FG lines up with the end of the length ruler. The two lowercase letters “r” and “s” in the name identify the last V and first J position encoded by the germline. This leaves one position to be filled by the N nucleotides, and this is the threonine represented as “T” in the clonotype name. The codon table shows that codon 4 for T is ACG, and the only way that the T can be encoded is that the initial nucleotide, A, is from the V germline sequence and the rest of the sequence is N derived.

D regions

Our nomenclature does not define the D region of the clonotype. The D regions could be defined after decoding by a homology search. Because of the truncation of D regions, they are often difficult to unambiguously assign. Defining the D region usage would be left to the individual investigator.

Properties of the naming system

The nomenclature described here has a number of important properties:

  1. The nomenclature is exhaustive: it ensures that each clonotype has an identifier.

  2. The nomenclature is compact: an identifier is relatively short.

  3. The nomenclature is an open system in that the identifiers are not restricted. Therefore, it allows expansion or addition of identifiers.

  4. The nomenclature is compartmental, allowing manipulation of identifiers. Manipulations could be sort, select, arrange, etc. Single or combinations of identifiers can be manipulated.
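The compartmental property can be illustrated with a short Python sketch. It assumes the fixed-width human layout without allele characters, in which the last 11 characters of a name are always the V identifier (5), the J identifier (3), and the length identifier (3); identifiers with other layouts would need a different split. The function name `sort_key` is hypothetical, and the third identifier in the list is made up for illustration rather than taken from the data.

```python
def sort_key(identifier):
    """Sort by V gene, then J gene, then length, then the full name."""
    v_gene = identifier[-11:-6]   # e.g. "B19S1"
    j_gene = identifier[-6:-3]    # e.g. "B27"
    length = identifier[-3:]      # e.g. "L11"
    return (v_gene, j_gene, length, identifier)


names = [
    "sIRSSy.1456B19S1B27L11",  # worked β-chain example from the text
    "rTs.4A38S2A53L13",        # first example of Table 5
    "sIRSSy.1356B19S1B27L11",  # hypothetical variant, for illustration only
]

ordered = sorted(names, key=sort_key)
for name in ordered:
    print(name)

# Selecting on one compartment is a plain string comparison.
bv19 = [name for name in names if sort_key(name)[0] == "B19S1"]
print(len(bv19))
```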

Advantages of the naming system

By having these characteristics, the nomenclature has several general advantages. By combining all five elements of the CDR3 region, this system permits any clonotype to be defined. Our nomenclature is more compact than either a nucleotide- or amino-acid-based naming system. It is two-thirds shorter than the CDR3 nucleotide sequence, while still describing the nucleotide sequence. The CDR3 amino acid sequence is pared to the NDN contribution only. It distinguishes clonotypes that use different encodings for the same CDR3 amino acid sequence. For example, we have found 207 different clonotypes that use the same BV19S1 and the same BJ2S7, and have exactly the same CDR3 amino acid sequence (CASSIRSSYEQYF). Because the amino acid identifier is the same for all of these clonotypes, it would be impossible to distinguish between them without the nucleotide identifier. We show some examples of this in Table 6 from our analysis of the HLA-A2-restricted response to influenza M1 58–66. This shows the power of the nomenclature for population studies that deal with a large number of similar clonotypes.

Table 6 Examples of different human clonotype sequences coding identical amino acids in the CDR3β

By being compartmental, the proposed nomenclature can enumerate all possible names. Each compartment is an identifier. While it is unlikely that new J or V regions will be uncovered in mice or man, these could be easily absorbed into the name. The compartmentalization allows the level of identification of the V region to be reflected in the name. If the identification of polymorphic variants of either V or J regions becomes important, the size of the compartment for these regions could be expanded to accommodate the addition. If the system were to be used for naming of BCR, an identifier for the heavy chain constant region could be added after the J identifier. The structure allows these identifiers to be fully computable, and the fixed character assignment of the gene identifiers makes it easy to sort based on the V gene, J gene, and the length. It also supports built-in error checking between the digits in the ID and the number of amino acids in the NDN region, which is a one-to-one relation for all functional TCR clones. For example, if there are four amino acids in the NDN region, there must be four digits in the ID part of the name, so errors are easily found.
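A minimal Python sketch of this check follows. Beyond the one-to-one count described above, it also tests that each digit is a possible codon number for its amino acid (for example, isoleucine has only three codons); that second test is an extension suggested by the codon table, not a rule stated in the nomenclature. The names `CODON_COUNTS` and `check_ids` are hypothetical.

```python
from collections import Counter

# Standard genetic code, 64 codons in T, C, A, G order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)

# Number of codons per letter, with stop codons counted under "O".
CODON_COUNTS = Counter("O" if aa == "*" else aa for aa in AMINO_ACIDS)


def check_ids(ndn, ids):
    """Return a list of problems found; an empty list means the pair is consistent."""
    problems = []
    if len(ndn) != len(ids):
        problems.append(
            f"{len(ndn)} uppercase amino acids but {len(ids)} ID digits"
        )
    for position, (aa, digit) in enumerate(zip(ndn, ids), start=1):
        limit = CODON_COUNTS.get(aa, 0)
        if not digit.isdigit() or not 1 <= int(digit) <= limit:
            problems.append(
                f"position {position}: {aa} has {limit} codons, got ID {digit!r}"
            )
    return problems


print(check_ids("IRSS", "1456"))  # consistent pair from the worked example
print(check_ids("IRSS", "145"))   # one digit missing
print(check_ids("I", "4"))        # isoleucine has no fourth codon
```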

Comparing TCR clonotypes

To the extent that clonotypes are public (1), they can be identified in many laboratories. Thus, a fixed nomenclature will avoid difficulties associated with local identifiers. For example, the M1 response in HLA-A2 individuals has been studied by many groups. We show that some of the published clonotypes identified by Moss et al. in 1991, by Lehner et al. in 1995, and by us (Naumov et al. 1998, 2006) were observed in more than one study (Table 7). This shows the power of a common robust naming system in comparing the results of related studies that have been published independently.

Table 7 Identifiers of the HLA-A2.1:M1 58–66-specific clones/clonotypes found in multiple studies

Alternative codon numbering systems

We also examined an alternative approach for codon numbering. We used the same codon numbering table, as described above, but then generated a list of all the possible ways of encoding a particular NDN sequence. The observed sequence is then defined by its index position on the list. The IRSS amino acid sequence can be used as an example: when the clonotype identifier encodes I1, R1, S1, S1, the ID number would be 1 instead of 1111. When the clonotype identifier encodes I1, R1, S1, S2, the ID number would be 2 instead of 1112. Since IRSS has 648 possible encoding combinations (3 × 6 × 6 × 6), the identifiers would be shorter, using one to three characters. However, the disadvantage of this approach is that it is less direct and requires a computer program for optimal implementation. This takes away the ability of an individual to manually identify or decode a particular sequence.
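The index described here amounts to reading the per-residue codon IDs as a mixed-radix number in which the last position varies fastest, which is what the 1111 → 1 and 1112 → 2 examples imply. The Python sketch below is one way to compute it under that assumption; the names `encoding_index` and `total_encodings` are hypothetical and are not part of the nomenclature.

```python
from collections import Counter

# Standard genetic code, 64 codons in T, C, A, G order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_COUNTS = Counter("O" if aa == "*" else aa for aa in AMINO_ACIDS)


def total_encodings(ndn):
    """Number of distinct nucleotide encodings of an NDN amino acid string."""
    total = 1
    for aa in ndn:
        total *= CODON_COUNTS[aa]
    return total


def encoding_index(ndn, ids):
    """1-based position of this encoding in the list of all encodings."""
    index = 0
    for aa, digit in zip(ndn, ids):
        # Each residue contributes a digit in base "number of codons for aa".
        index = index * CODON_COUNTS[aa] + (int(digit) - 1)
    return index + 1


print(total_encodings("IRSS"))
print(encoding_index("IRSS", "1111"), encoding_index("IRSS", "1112"))
print(encoding_index("IRSS", "1456"))
```

The index is shorter than the digit string, but decoding it requires the codon counts for every residue, which is the loss of manual readability noted above.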


We have implemented a rational nomenclature system that makes TCR sequences easier to read and compare. The nomenclature rules are simple and easy to implement. With the codon table available, it is easy to name any TCR clonotype or clone without developing customized naming software. Nevertheless, the rules are simple enough to be encoded in computer programs. The system has built-in error checking in the form of the one-to-one correspondence between each digit in the ID and each uppercase amino acid in the clonotype identifier. The benefits of a consistent nomenclature would accrue rapidly as the number of TCRs under study increases.

Implementing this nomenclature would facilitate deployment of clonotype databases run by individual laboratories for specific immune responses and immune diseases. The clonotype names within these databases would be reliable, error-free, and allow easy cross-referencing and comparison of T cell repertoires by different laboratories. The clonotypes could be cataloged in a single database and annotated as to their occurrence and associations with particular responses. Such a catalog is only possible by providing an easy-to-use nomenclature. T cell clones can be unambiguously identified by naming both chains. The same identification could be provided for single-cell PCR data where both chain sequences are available. The framework of this naming system could also be implemented for B cell receptors if a system was added to account for somatic hypermutation. Such a convention would open up the possibility of creating a BCR catalog which would be a useful tool for investigators working on BCR repertoires.


  • Bluthmann H, Kisielow P, Uematsu Y, Malissen M, Krimpenfort P, Berns A, von Boehmer H, Steinmetz M (1988) T-cell-specific deletion of T-cell receptor transgenes allows functional rearrangement of endogenous alpha- and beta-genes. Nature 334:156–159. doi:10.1038/334156a0

  • Cameron TO, Cohen GB, Islam SA, Stern LJ (2002) Examination of the highly diverse CD4(+) T-cell repertoire directed against an influenza peptide: a step towards TCR proteomics. Immunogenetics 54:611–620. doi:10.1007/s00251-002-0508-y

  • Chien YH, Iwashima M, Wettstein DA, Kaplan KB, Elliott JF, Born W, Davis MM (1987) T-cell receptor delta gene rearrangements in early thymocytes. Nature 330:722–727. doi:10.1038/330722a0

  • Correia-Neves M, Waltzinger C, Mathis D, Benoist C (2001) The shaping of the T cell repertoire. Immunity 14:21–32. doi:10.1016/S1074-7613(01)00086-3

  • Davis MM, Bjorkman PJ (1988) T-cell antigen receptor genes and T-cell recognition. Nature 334:395–402. doi:10.1038/334395a0

  • Davis MM, Chien YH (2003) T cell antigen receptors. In: Paul WE (ed) Fundamental immunology, 5th edn. Lippincott Williams & Wilkins, Philadelphia, pp 227–258

    Google Scholar 

  • Elliott JF, Rock EP, Patten PA, Davis MM, Chien YH (1988) The adult T-cell receptor delta-chain is diverse and distinct from that of fetal thymocytes. Nature 331:627–631. doi:10.1038/331627a0

    Article  CAS  PubMed  Google Scholar 

  • Giudicelli V, Chaume D, Lefranc MP (2005) IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes. Nucleic Acids Res 33:D256–D261. doi:10.1093/nar/gki010

    Article  CAS  PubMed  Google Scholar 

  • Kalams SA, Johnson RP, Trocha AK, Dynan MJ, Ngo HS, D'Aquila RT, Kurnick JT, Walker BD (1994) Longitudinal analysis of T cell receptor (TCR) gene usage by human immunodeficiency virus 1 envelope-specific cytotoxic T lymphocyte clones reveals a limited TCR repertoire. J Exp Med 179:1261–1271. doi:10.1084/jem.179.4.1261

    Article  CAS  PubMed  Google Scholar 

  • Kent SC, Chen Y, Bregoli L, Clemmings SM, Kenyon NS, Ricordi C, Hering BJ, Hafler DA (2005) Expanded T cells from pancreatic lymph nodes of type 1 diabetic subjects recognize an insulin epitope. Nature 435:224–228. doi:10.1038/nature03625

    Article  CAS  PubMed  Google Scholar 

  • Kolar GR, Capra JD (2003) Immunoglobulins: structure and function. In: Paul WE (ed) Fundamental immunology, 5th edn. Lippincott Williams & Wilkins, Philadelphia, pp 47–68

    Google Scholar 

  • La Gruta NL, Thomas PG, Webb AI, Dunstone MA, Cukalac T, Doherty PC, Purcell AW, Rossjohn J, Turner SJ (2008) Epitope-specific TCRbeta repertoire diversity imparts no functional advantage on the CD8+ T cell response to cognate viral peptides. Proc Natl Acad Sci USA 105:2034–2039. doi:10.1073/pnas.0711682102

    Article  PubMed  Google Scholar 

  • Lefranc MP, Lefranc G (2001) The T cell receptor facts book. Academic, London

    Google Scholar 

  • Lehner PJ, Wang EC, Moss PA, Williams S, Platt K, Friedman SM, Bell JI, Borysiewicz LK (1995) Human HLA-A0201-restricted cytotoxic T lymphocyte recognition of influenza A is dominated by T cells bearing the V beta 17 gene segment. J Exp Med 181:79–91. doi:10.1084/jem.181.1.79

    Article  CAS  PubMed  Google Scholar 

  • Maslanka K, Yassai MB, Gorski J (1996) Molecular identification of T cells that respond in a primary bulk culture to a peptide derived from a platelet glycoprotein implicated in neonatal alloimmune thrombocytopenia. J Clin Invest 98:1802–1808. doi:10.1172/JCI118980

    Article  CAS  PubMed  Google Scholar 

  • McHeyzer-Williams MG, Davis MM (1995) Antigen-specific development of primary and memory T cells in vivo. Science 268:106–111. doi:10.1126/science.7535476

    Article  CAS  PubMed  Google Scholar 

  • Moss PA, Moots RJ, Rosenberg WM, Rowland-Jones SJ, Bodmer HC, McMichael AJ, Bell JI (1991) Extensive conservation of alpha and beta chains of the human T-cell antigen receptor recognizing HLA-A2 and influenza A matrix peptide. Proc Natl Acad Sci USA 88:8987–8990. doi:10.1073/pnas.88.20.8987

    Article  CAS  PubMed  Google Scholar 

  • Naumov YN, Hogan KT, Naumova EN, Pagel JT, Gorski J (1998) A class I MHC-restricted recall response to a viral peptide is highly polyclonal despite stringent CDR3 selection: implications for establishing memory T cell repertoires in "real-world" conditions. J Immunol 160:2842–2852

    CAS  PubMed  Google Scholar 

  • Naumov YN, Naumova EN, Clute SC, Watkin LB, Kota K, Gorski J, Selin LK (2006) Complex T cell memory repertoires participate in recall responses at extremes of antigenic load. J Immunol 177:2006–2014

    CAS  PubMed  Google Scholar 

  • Pewe LL, Netland JM, Heard SB, Perlman S (2004) Very diverse CD8 T cell clonotypic responses after virus infections. J Immunol 172:3151–3156

    CAS  PubMed  Google Scholar 

  • Probert CS, Saubermann LJ, Balk S, Blumberg RS (2007) Repertoire of the alpha beta T-cell receptor in the intestine. Immunol Rev 215:215–225. doi:10.1111/j.1600-065X.2006.00480.x

    Article  CAS  PubMed  Google Scholar 

  • Rowen L, Koop BF, Hood L (1996) The complete 685-kilobase DNA sequence of the human β T cell receptor locus. Science 272:1755–1762. doi:10.1126/science.272.5269.1755

    Article  CAS  PubMed  Google Scholar 

  • Rudolph MG, Stanfield RL, Wilson IA (2006) How TCRs bind MHCs, peptides, and coreceptors. Annu Rev Immunol 24:419–466. doi:10.1146/annurev.immunol.23.021704.115658

    Article  CAS  PubMed  Google Scholar 

  • Shin S, El-Diwany R, Schaffert S, Adams EJ, Garcia KC, Pereira P, Chien YH (2005) Antigen recognition determinants of gammadelta T cell receptors. Science 308:252–255. doi:10.1126/science.1106480

    Article  CAS  PubMed  Google Scholar 

  • Uematsu Y, Ryser S, Dembic Z, Borgulya P, Krimpenfort P, Berns A, von Boehmer H, Steinmetz M (1988) In transgenic mice the introduced functional T cell receptor beta gene prevents expression of endogenous beta genes. Cell 52:831–841. doi:10.1016/0092-8674(88)90425-4

    Article  CAS  PubMed  Google Scholar 

  • Venturi V, Price DA, Douek DC, Davenport MP (2008) The molecular basis for public T-cell responses? Nat Rev Immunol 8:231–238. doi:10.1038/nri2260

    Article  CAS  PubMed  Google Scholar 

Download references


We thank Dr. Marie-Paule Lefranc for clarification of the mouse TCR nomenclature. We also thank Dr. Andrea Ferrante for helpful discussions. This work was funded by National Institutes of Health Grant U19 AI062627.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Author information



Corresponding author

Correspondence to Maryam B. Yassai.

Electronic supplementary materials

Supplementary Table 1

Examples of human TCRBV subfamilies that have more than one member: sequence homology and their assigned name (PDF 2137 kb)


About this article

Cite this article

Yassai, M.B., Naumov, Y.N., Naumova, E.N. et al. A clonotype nomenclature for T cell receptors. Immunogenetics 61, 493–502 (2009).


  • TCR
  • Nomenclature
  • CDR3
  • Clonotype