Introduction

Rhesus macaques are widely used as preclinical models for human infectious and autoimmune diseases, of which HIV and multiple sclerosis are examples, as well as for transplantation and vaccine development research (Evans et al. 1999; Brok et al. 2001; Horton et al. 2001; Wood et al. 2001; Muhl et al. 2002; Newberg et al. 2002; Mothe et al. 2003; O’Connor et al. 2003; Friedrich et al. 2004; Knechtle and Burlingham 2004; Lee et al. 2004; Torrealba et al. 2004). Gene products of the major histocompatibility complex (MHC) play a key role in adaptive immunology, and a prominent feature of most of the genes in this region is their high degree of polymorphism. A well-characterized rhesus macaque (Mamu) MHC is a prerequisite for various aspects of biomedical research. By active immunization of rhesus macaques, mainly of Indian origin, 14 Mamu-A and 16 -B serotypes were defined (Bontrop et al. 1995; Otting et al. 2005). However, serological typing for non-Indian animals, such as those originating from China or Southeast Asia, is compromised by the lack of well-defined, specific antisera. Thus, comprehensive molecular typing methods, such as sequence-based or allele-specific amplification, have to be established for the Mamu system of rhesus macaques from different geographic sources.

In humans, class I genes HLA-A, -B, and -C and class II genes HLA-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1 exhibit a high degree of allelic variation. In the rhesus macaque, most of these loci are also present and known to be polymorphic. Mamu-DQA1 and -DQB1 are highly variable and segregate as stable DQA1/DQB1 haplotypes (Khazand et al. 1999; Doxiadis et al. 2001). The most striking difference, however, can be observed for the class I -A and -B loci. Whereas in humans there is only one HLA-A and one -B locus per chromosome, each with a high degree of polymorphism, in cynomolgus and rhesus macaques there are multiple expressed Mamu-A- and -B-like loci on a single haplotype (Boyson et al. 1996; Erlich et al. 1996; Uda et al. 2004). Moreover, Mamu-A and -B region configurations display diversity with regard to the number and combination of loci transcribed per chromosome (Otting et al. 2005). The Mamu-DRB region is comparable to the class I region in this respect, since more than 30 -DRB region configurations have been described that vary in locus number and content (Doxiadis et al. 2000, 2001). Each Mamu-DRB region configuration is composed of one to three transcribed -DRB genes and up to five pseudogenes per chromosome. Contradictory chromosomal assignments have been published for the rhesus macaque MHC, first to Chr. 2 and later to Chr. 5 (Garver et al. 1980; Hirai et al. 1991). Recently, fluorescence in situ hybridization mapping of six rhesus macaque cosmid clones localized the MHC on the long arm of Chromosome 6 in 6q24, the orthologous region to human 6p21.3 (Huber et al. 2003).

The recent completion of the rhesus macaque MHC sequence (Daza-Vamenta et al. 2004) confirmed previous findings of variation in number and content for class I and II genes and revealed an overall similarity of organization with the human orthologue. This conserved organization was offset by internal expansions, most notably of Mamu-A and Mamu-B genes, which explained the difference in length of the region of 5.3 Mb in rhesus macaque and about 3.7 Mb in human.

Because the molecular typing of class I and II genes is complicated and time-consuming, an analysis of polymorphic microsatellites or short tandem repeats (STRs) spanning the MHC provides an alternative method for rapid and accurate characterization of the region. Such an approach has been used for MHC typing in humans for tissue matching and donor screening (Carrington and Wade 1996; Foissac et al. 2001). Hundreds of STRs are situated on human Chr. 6, and lists of markers mapping within or near the HLA region have been compiled and updated (Tamiya et al. 1999; Foissac et al. 2000; Matsuzaka et al. 2000, 2001; Cullen et al. 2003). Because STRs tend to be conserved among closely related species, especially between Old World monkeys and hominoids (Rubinsztein et al. 1995; Coote and Bruford 1996; Clisson et al. 2000; Rogers et al. 2000), HLA-linked STRs provide an abundant source of potential markers for use in rhesus macaques.

Among 37 STRs screened for robust amplification and polymorphism in rhesus macaque, eight markers, D6S291, D6S2741, D6S2876, DRA-CA, MICA, MOG-CA, D6S1691, and D6S276, were selected that spanned the HLA region (Martin et al. 1998; Foissac et al. 2000; Cullen et al. 2002, 2003). The remaining 29 markers were excluded, mainly because the human primers failed to amplify rhesus macaque DNA. This study sought to characterize the polymorphism of the eight STRs in rhesus macaques of Indian and Chinese origin and to evaluate the association of STR variants with alleles of the MHC class I and II genes. The ability to derive extended haplotypes for the MHC region provides additional relevant information that can be applied to experimental designs in biomedical, population, evolution, and cell biology research.

Material and methods

Animals

For haplotype analyses with STRs, class I, and class II loci, 118 rhesus macaques from the self-sustaining colony of the Biomedical Primate Research Centre (BPRC), with a breeding history of more than five generations, were tested. Most of the founder animals were from India, but animals from China and Burma were also present. The animals belonged to six breeding groups, each consisting of one alpha male, several females, and their offspring. The smallest group comprised three females and six offspring and the largest six females and 25 offspring. Four of these breeding groups had founders of Indian origin. One group of animals was of Burmese origin and all females of the last group originated in India, whereas the male was an Indian/Chinese crossbred macaque (Doxiadis et al. 2003).

Polymorphism of STR loci was analyzed in two groups of unrelated rhesus macaques from the breeding colony of the California National Primate Research Center (CNPRC), University of California, Davis, Calif., USA. These groups represented animals originating in India (n=51) and China (n=44). DNA samples were obtained from the archive of the Veterinary Genetics Laboratory (VGL), University of California.

For linkage analyses of STR markers, genotype data for seven paternal half- and full-sib families from the CNPRC colony with a total of 331 offspring–dam pairs were obtained from VGL’s database.

Serological MHC typing

The BPRC rhesus macaques were serologically typed for MHC class I antigens, and 14 Mamu-A and 16 Mamu-B serotypes were defined. Serological assays were performed by a cytotoxicity test using specific antibodies produced by the active immunization of mainly Indian rhesus macaques (Bontrop et al. 1995).

DNA isolation and direct sequencing of -DQA1, -DQB1, and -DPB1

Genomic DNA was extracted from EDTA blood samples or from immortalized B lymphocytes by a standard salting out procedure. Partial sequences of exon 2 for -DQA1, -DQB1, and -DPB1 were obtained by direct sequencing of PCR products according to procedures previously described (Doxiadis et al. 2003).

Cloning and sequencing of -DRB

Cloning and sequencing of -DRB exon 2 were performed as described earlier (Doxiadis et al. 2003) with the following modifications: the PCR program included a final step of 30 min at 72°C to produce a 3′-end extension by Taq polymerase, and the InsT/Aclone cloning kit (Fermentas, St. Leon-Roth, Germany) was used for direct cloning of PCR products. The PCR products were purified and ligated into the vector pTZ57R, which had been pre-cleaved with Eco32I. After transformation of Escherichia coli XL1-Blue, plasmid clones containing inserts were used to prepare DNA for cycle sequencing with the ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction kit v3.1 (Applied Biosystems, Foster City, Calif., USA). Sequencing reactions were run on the ABI 3100 genetic analyzer (Applied Biosystems), and data were analyzed using the Sequence Navigator program (Applied Biosystems) as previously described (de Groot et al. 2004).

STR genotyping

The STRs used were D6S291, D6S2741 (alias G2.56412), D6S2876 (alias G51152), DRA-CA (alias D6S2883), MICA, MOG-CA (alias D6S2972), D6S276, and D6S1691. Primer sequences, concentration in PCR reactions, fluorescence labels, and source references are shown in Table 1. The cycling parameters in PTC100 thermal cyclers (MJ Research, Waltham, Mass., USA) consisted of an initial denaturation for 5 min at 90°C of a mixture containing only DNA template and primers. After this step, the remaining reagents were added, and the program continued with four cycles of 1 min at 94°C, 30 s at 58°C, 30 s at 72°C, followed by 25 cycles of 45 s at 94°C, 30 s at 58°C, 30 s at 72°C. A final elongation step at 72°C was performed for 30 min. Multiplex PCR mixtures in a total volume of 12.5 μl contained 2.5 mM MgCl2, 0.20 mM of each dNTP, PCR buffer II, and 0.5 U AmpliTaq polymerase (Applied Biosystems). PCR products were run on ABI PRISM 377 DNA sequencers (Applied Biosystems), and genotypes were determined using GeneScan-350 ROX size standard (Applied Biosystems) and the STRand computer software for fragment size analysis (available at http://www.vgl.ucdavis.edu/informatics/Strand/). Allele sizes were rounded to the nearest integer number.

Table 1 Characteristics of major histocompatibility complex (MHC)-linked short tandem repeat (STR) markers and details for multiplex PCR amplification with fluorescence-labeled primers

Cloning and sequencing of STRs

Rhesus macaque sequences for each of the STRs were obtained by cloning PCR products from each of two heterozygous animals using a TOPO TA cloning kit and according to the manufacturer’s recommendations (Invitrogen, Carlsbad, Calif., USA). To avoid sequencing of stutter bands, 12–18 colonies for each STR were first screened with fluorescence-labeled primers to select plasmid clones containing inserts corresponding to the alleles defined by the genotype of each animal. Sequencing of two to four alleles for each STR was done by cycle sequencing with an ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction kit, version 3.1 (Applied Biosystems). Sequencing reactions were run on ABI Prism 377 DNA sequencers (Applied Biosystems), and sequences were analyzed using the SeqMan module of the DNASTAR software suite (DNASTAR, Madison, Wis., USA). A representative sequence for each STR was deposited in GenBank (accession numbers AY786541–AY786548).

STR polymorphism and linkage analyses

Animals from CNPRC representing unrelated Indian- and Chinese-origin rhesus macaques were used to characterize the polymorphism of STR loci. The computer program GENEPOP, version 3.1b (Raymond and Rousset 1995), was used to estimate allele frequencies and heterozygosities, and to test conformity to Hardy–Weinberg expectations (HWE). Polymorphism information content (PIC) was calculated according to Botstein et al. (1980).
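
The two diversity statistics named above can be illustrated with a minimal sketch. It assumes the standard definitions: expected heterozygosity as one minus the sum of squared allele frequencies, and PIC as given by Botstein et al. (1980). The function names and the example frequencies are hypothetical, and the sketch applies no small-sample correction, so it is not a reproduction of the GENEPOP computation.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2); no small-sample correction is applied."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC after Botstein et al. (1980):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross_term = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross_term


# Hypothetical allele frequencies for a single STR locus (they sum to 1).
example_freqs = [0.5, 0.3, 0.2]
he = expected_heterozygosity(example_freqs)
pic_value = pic(example_freqs)
print(f"He = {he:.4f}, PIC = {pic_value:.4f}")
```

PIC is always slightly lower than expected heterozygosity because it discounts matings in which the offspring genotype does not reveal which parental allele was transmitted.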

Linkage analyses were performed with CRIMAP (Green et al. 1990), and the BUILD function with a LOD threshold of 3 was used to construct a map of the region based on genotype data of the eight STRs for seven paternal half-sib families.
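
To illustrate what a LOD threshold of 3 means, the following sketch computes a simple two-point LOD score for phase-known meioses. This is a textbook simplification and not the multipoint likelihood that CRIMAP evaluates; the function name and the counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD for phase-known meioses at recombination fraction theta.

    LOD = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ]
    """
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)  # likelihood under free recombination
    return log_l_theta - log_l_null


# Hypothetical counts: 2 recombinants among 100 informative meioses.
theta_hat = 2 / 100
lod = lod_score(2, 100, theta_hat)
linked = lod >= 3.0  # a LOD of 3 corresponds to odds of 1000:1 for linkage
print(f"theta = {theta_hat:.2f}, LOD = {lod:.2f}, linked = {linked}")
```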

Results and discussion

Characteristics of MHC-linked STRs

BLAST comparisons of rhesus macaque STR sequences with the GenBank database revealed high similarity to the human orthologues for D6S276, D6S291, and D6S1691 (data not shown). These sequences were not represented in the published rhesus macaque MHC sequence (Daza-Vamenta et al. 2004) but, in agreement with predictions from the human genome map, we assumed that these STRs flank the MHC region of rhesus macaques, with D6S291 located at the centromeric end and D6S276/D6S1691 at the telomeric end. Sequences for D6S2741, D6S2876, DRA-CA, MICA, and MOG-CA had high similarity to sequences in rhesus macaque MHC bacterial artificial chromosome (BAC) clones, as well as to the human orthologues (data not shown).

Allele frequencies in Indian and Chinese rhesus macaques for the eight STRs are given in Table 2. The number of alleles per locus ranged from five (MICA) to 21 (DRA-CA). Chinese animals showed overall greater allelic diversity with a total number of alleles (TNA) of 102 and average heterozygosity of 0.82, whereas in Indian monkeys, TNA was 89 and average heterozygosity was 0.77. These results were in agreement with findings of higher allelic diversity and average heterozygosity in Chinese than in Indian rhesus macaques for other autosomal STRs (Morin et al. 1997). Differences between the two groups were characterized by distinct allele frequency distributions rather than by the presence of population-specific alleles in either group. Except for MOG-CA and MICA, all markers were as variable in the rhesus macaques as in humans, and some appeared to be even more polymorphic (Foissac et al. 2000).

Table 2 Allele frequencies, observed (Ho) and expected (He) heterozygosity, and polymorphism information content (PIC) value of eight MHC-linked STRs in rhesus macaques of Indian (In=51) and Chinese (Ch=44) origin

The presence of a null allele in D6S2741 and D6S2876 in Indian and Chinese monkeys was identified through use of these markers for parentage analysis of CNPRC monkeys. A null allele for D6S2741 was also identified among BPRC animals. Null alleles, caused by sequence amplification failure because of a mismatch in primer binding sequences, are more likely to occur when heterologous primers are used for PCR amplification. Therefore, this finding is not unexpected. Some of the animals included in the population samples were known to have a null allele at these loci, and this allowed us to obtain a minimum estimate of its frequency.

Genotypic distributions were in agreement with HWE except for D6S2876 in Chinese rhesus monkeys, which showed a statistically significant deviation (P≤0.01 after correction for multiple tests) explained by heterozygote deficiency. This result is most likely accounted for by undetected null alleles among the Chinese animals, which would cause an apparent deficit of heterozygous genotypes. The PIC values estimated for the eight markers (range 0.50–0.91) indicate that all loci will be highly informative for linkage- or association-based studies. The presence of null alleles in D6S2741 and D6S2876 justifies the development of rhesus macaque-specific primers for these markers to improve amplification of alleles at these loci.
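
The reasoning that an undetected null allele produces an apparent heterozygote deficit can be sketched numerically. The example assumes random mating, that carriers of one visible allele plus a null are scored as homozygotes, and that null homozygotes fail to amplify and are dropped. The function name and all frequencies are hypothetical and are not estimates from this study.

```python
def apparent_heterozygosity(visible_freqs, null_freq):
    """Fraction of scored animals that appear heterozygous when a null
    allele segregates at frequency null_freq under random mating."""
    total = sum(visible_freqs) + null_freq
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Genotypes carrying two different visible alleles.
    true_visible_het = (1.0 - null_freq) ** 2 - sum(p * p for p in visible_freqs)
    # Null homozygotes give no PCR product and are not scored.
    scored = 1.0 - null_freq ** 2
    return true_visible_het / scored


# Hypothetical two-allele locus, without and with a null allele at 10%.
without_null = apparent_heterozygosity([0.5, 0.5], 0.0)
with_null = apparent_heterozygosity([0.45, 0.45], 0.10)
print(f"no null: {without_null:.3f}, null at 0.10: {with_null:.3f}")
```

In this hypothetical case both visible alleles still appear equally frequent, so the naive expectation remains 0.5 while the observed heterozygosity drops, which is the pattern a Hardy–Weinberg test flags as heterozygote deficiency.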

MHC haplotypes

Haplotype analysis was done based on segregation of Mamu-A and -B serotypes; class II genotypes for -DQA1, -DQB1, and -DRB loci; and the eight STRs. The animals used for this analysis were members of BPRC breeding groups in which one alpha male was housed together with several females. Because each female had at least two offspring, extended parental MHC haplotypes could be defined for both parents as shown in Table 3.

Table 3 Extended MHC haplotypes of rhesus macaques

Among 58 parental chromosomes, 44 distinct, extended haplotypes were identified. Only animals related to the same founder shared identical extended haplotypes, and this accounted for the remaining 15 chromosomes. The ancestor haplotypes are depicted in Table 3 with the same color. Among Indian rhesus macaques, STR typing provided additional information regarding configurations of the MHC region that might not be evident from comparisons of more limited typing of class I and class II genes. For example, ancestor haplotypes A2777 (purple) and B2957 (orange) share the same class I and class II -DQB1, -DQA1, and -DRB alleles but differ for -DPB1 and D6S2741.

In contrast to the haplotype definition of Indian monkeys, MHC typing of non-Indian rhesus macaques is less developed. First, serotyping yields ambiguous results for lack of well-defined, specific antisera; second, molecular typing methods for MHC class I alleles are time consuming. Sequence-based class II typing, although informative and accurate, does not reflect the whole MHC. Therefore, STR typing provides a suitable method for the definition of extended haplotypes of non-Indian monkeys, as shown for Burmese animals of group 5 in Table 3. Haplotypes c and i of animals 4064 and 4050, for example, share the same Mamu-DQA1, -DQB1, and -DRB alleles, whereas four STRs differentiate the two haplotypes. Another example is given by haplotype d of monkey 4064 and haplotype e of monkey 4065, which can be distinguished only by two STR markers.
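
The kind of comparison described above, in which two haplotypes agree at the class II genes but differ at several STRs, can be sketched as a marker-by-marker diff. The function name, the allele labels, and the fragment sizes below are hypothetical placeholders and are not the values reported in Table 3.

```python
def differing_markers(hap_a, hap_b):
    """Return markers typed in both haplotypes whose alleles differ,
    in the marker order of hap_a."""
    return [m for m in hap_a if m in hap_b and hap_a[m] != hap_b[m]]


# Two hypothetical extended haplotypes: identical class II alleles,
# different fragment sizes (bp) at four of the STRs.
hap_1 = {
    "D6S291": 206, "D6S2741": 281, "DQA1": "a1", "DQB1": "b1",
    "DRB": "cfg1", "DRA-CA": 240, "MICA": 196, "D6S276": 228,
}
hap_2 = {
    "D6S291": 210, "D6S2741": 287, "DQA1": "a1", "DQB1": "b1",
    "DRB": "cfg1", "DRA-CA": 240, "MICA": 199, "D6S276": 232,
}

diff = differing_markers(hap_1, hap_2)
print(diff)
```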

The extended haplotypes derived for BPRC animals indicated that, even in the absence of gene-specific typing, STR typing can be used to distinguish haplotypes that are identical by descent and provide information about the MHC region useful in the selection of experimental animals or analysis of experimental data.

Linkage disequilibrium

Linkage disequilibrium (LD) describes the non-random association of alleles at nearby loci, that is, their co-occurrence more often than would be expected if the loci were segregating independently in a population (Ardlie et al. 2002; Wall and Pritchard 2003). LD association analyses have become increasingly useful for mapping disease genes and phenotypes and for providing insight into the biology of meiotic recombination and the evolution of MHC haplotypes in humans (Carrington 1999; Huttley et al. 1999). LD association studies in rhesus macaques, particularly with genes and markers in the MHC region, could provide critical information relating to several aspects of biomedical research and comparative data regarding the biology and evolution of the MHC in that species.
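
As a minimal sketch of how such non-random association is commonly quantified, the following computes the standard pairwise measures D, D′, and r² for one allele at each of two loci. These statistics are not reported in this study, where LD was assessed by inspection of haplotypes; the function name and the frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Pairwise LD for allele A at one locus and allele B at another.

    p_ab: frequency of haplotypes carrying both A and B
    p_a, p_b: frequencies of alleles A and B
    Returns (D, D', r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical case: every haplotype carrying allele A also carries allele B.
d, d_prime, r2 = ld_measures(p_ab=0.3, p_a=0.3, p_b=0.4)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

For multiallelic markers such as STRs, a measure like this would be applied to one allele pair at a time, for example one STR allele against one class II allele.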

BLAST comparisons of the STR sequences that we obtained against the BAC clone data used to construct the complete rhesus macaque MHC sequence (Daza-Vamenta et al. 2004) placed D6S2741 in clones 118H5 and 038L02 near Mamu-DPB1; D6S2876 in clones 281E18, 63B15, and 007H18 near Mamu-DQB/DQA; DRA-CA in clones 370O021 and 240D05 near Mamu-DRA; MICA in clones 24N14 and I88J04 near MIC1; and MOG-CA in clone 268P23 near MOG. Moreover, comparisons with other rhesus macaque sequences in GenBank confirmed that the MICA STR is located in exon 5 of MIC1, as it is in humans (Mizuki et al. 1997). These comparisons allowed the exact positioning of five markers on the genomic map of the rhesus macaque MHC (Fig. 1). The close proximity of these loci prompted us to investigate whether there were associations between Mamu alleles and adjacent STR markers that would be suggestive of LD.

Fig. 1

Localization of short tandem repeat (STR) markers on the rhesus macaque major histocompatibility complex (MHC). The schematic map is drawn according to Daza-Vamenta et al. 2004, with the telomeric end to the left and centromeric to the right in a kilobase scale. The markers D6S291, D6S276, and D6S1691 are localized outside the core MHC region. MHC class I and II regions are partly enlarged above or below the original scale, respectively

Inspection of haplotypes showed association of alleles spanning the region D6S2876 to DRA-CA, such as the blocks defined by [D6S2876-220; DQB1*1801/DQA1*2601; DRB1*0303,DRB1*1007; DRA-CA-234] and [D6S2876-208; DQB1*0601/DQA1*0104; DRB1*0309,DRB6*0101,DRB*W201; DRA-CA-260] found in Indian monkeys (Table 4). Mamu-DRA is polymorphic and alleles of this locus are associated with certain DRB region configurations (de Groot et al. 2004). The apparent LD between DRA-CA and DRB alleles reflects these associations. In Burmese monkeys, however, [D6S2876-208; DQB1*0601/DQA1*0104] was associated with another -DRB region configuration/DRA-CA allele, consistent with findings of specific Mamu haplotypes associated with the geographic origin of rhesus macaques (Doxiadis et al. 2003).

Table 4 Linkage disequilibria between rhesus macaque class II alleles and adjacent STR markers

In contrast to the associations between DQA1, DQB1, DRB, D6S2876, and DRA-CA alleles, strong LD could not be observed for all Mamu-DPB1 and D6S2741 alleles (Table 4). DPB1*14, an allele that to our knowledge has been found only in Burmese rhesus macaques, was exclusively associated with D6S2741-287 (Table 4, light green). DPB1*10, the most frequent allele in Indian monkeys, however, was observed with various D6S2741 alleles (Table 4, light blue). These observations are remarkable because of the close proximity of D6S2741 to the DPB1 locus in rhesus macaques. Although far more haplotypes need to be analyzed, one possible explanation for the lack of allelic association in Indian rhesus macaques could be the presence of a recombination hotspot between -DPB1 and the marker, as has been observed in humans (Cullen et al. 2002). Alternatively, a higher mutation rate for D6S2741 could also be a factor in breaking down LD with DPB1. Further investigation is needed to determine whether such a hotspot exists in Indian but not in Burmese rhesus macaques, or whether D6S2741-287 is a more recent mutation that arose on a DPB1*14 chromosome.

No evidence of LD was found between serotypes of Mamu-A, -B and nearby STRs MOG-CA and MICA, respectively. This might be explained by the fact that only two STRs, localized in the more stable part of the MHC class I region, were analyzed. Additionally, these two STRs had the lowest level of allelic diversity and each contained one allele with frequency ≥0.50. These factors would make it difficult to detect allelic associations.

Recombination in the MHC region

Recombination in the MHC region was evaluated in two data sets. First, linkage analyses were performed with seven paternal families from the CNPRC colony with a total of 331 offspring–dam pairs to obtain recombination distances for the STR markers. The average number of phase-known, informative meioses was 171±49. The sex-averaged map constructed was D6S291–(3.3 cM)–D6S2741–(1.2 cM)–D6S2876–(1.5 cM)–DRA-CA–(0.7 cM)–MICA–(0.5 cM)–MOG-CA–(2.5 cM)–D6S276–(0.4 cM)–D6S1691. These analyses allowed us to obtain the distance of the flanking markers D6S291, D6S276, and D6S1691 relative to the core MHC STRs. Approximate estimates for the location of the core MHC STRs, based on position in BAC clones and the Mamu region (Daza-Vamenta et al. 2004), placed D6S2741 at ~4,980 kb (based on the position on BAC clone 118H5), D6S2876 at ~4,680 kb (based on the position on BAC clone 281E18), DRA-CA at ~4,380 kb, MICA at ~3,320 kb, and MOG-CA at ~380 kb (Fig. 1). The distances between markers suggested recombination rates of about 0.005 cM/kb in the class II region between D6S2741 and DRA-CA, 0.0007 cM/kb between DRA-CA and MICA, and 0.0002 cM/kb between MICA and MOG-CA. The average rate across the 4,600-kb span between D6S2741 and MOG-CA was 0.0009 cM/kb. Although these estimates are preliminary and based on few markers, the pattern of recombination distribution may be comparable to that found in humans (Cullen et al. 2002). However, the lower recombination rates determined in the Mamu class I region in comparison to the class II region may be related to the lower informativeness of, and longer physical distance between, the two markers in the class I region (MICA and MOG-CA). The complete rhesus macaque MHC genomic map will provide an ample source of markers for use in more rigorous studies of recombination across the region and comparison with what is known for humans.
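
The recombination rates quoted above follow from dividing the genetic distance between two markers by their physical separation. The sketch below reproduces that arithmetic from the map distances and approximate positions given in the text; the variable and function names are illustrative only.

```python
# Approximate physical positions (kb) of the core MHC STRs, as given in the text.
positions_kb = {
    "D6S2741": 4980, "D6S2876": 4680, "DRA-CA": 4380, "MICA": 3320, "MOG-CA": 380,
}
# Marker order and sex-averaged distances (cM) between adjacent markers.
order = ["D6S2741", "D6S2876", "DRA-CA", "MICA", "MOG-CA"]
adjacent_cm = [1.2, 1.5, 0.7, 0.5]


def recombination_rate(marker_1, marker_2):
    """Genetic distance divided by physical distance, in cM/kb."""
    i, j = sorted((order.index(marker_1), order.index(marker_2)))
    genetic_cm = sum(adjacent_cm[i:j])
    physical_kb = abs(positions_kb[marker_1] - positions_kb[marker_2])
    return genetic_cm / physical_kb


for pair in [("D6S2741", "DRA-CA"), ("DRA-CA", "MICA"),
             ("MICA", "MOG-CA"), ("D6S2741", "MOG-CA")]:
    print(pair, f"{recombination_rate(*pair):.5f} cM/kb")
```

Before rounding, the four intervals give approximately 0.0045, 0.00066, 0.00017, and 0.00085 cM/kb, consistent with the rounded values reported in the text.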

To obtain additional information regarding recombination in the MHC, we also analyzed segregation data for the BPRC pedigrees. A total of nine recombinants were observed (Table 5) within and adjacent to a core MHC region of about 4.7 Mb spanning DPB1 to MOG-CA (Daza-Vamenta et al. 2004). Two of these were localized between DRA-CA and MICA, a DNA segment of about 1.1 Mb separating class I and class II genes (Table 5, offspring 98004, group 4 and 98039, group 1). A third recombinant mapped between the DR loci and MICA (Table 5, offspring r00035, group 4). Since the DRA-CA allele was not informative in this offspring, positioning of the crossover before or after this marker was not possible. Because of the strong LD between this STR and the -DRB loci, the break most probably occurred between DRA-CA and MICA. Three recombinants separated marker D6S291 from DPB1 (Table 5, offspring 97030, 96090, and 95005, group 5). Three crossovers were observed telomeric of the class I region. One of these separated Mamu-A from D6S276/D6S1691 (Table 5, offspring 97012, group 7) but was not informative for MOG-CA. Two crossovers separated MOG-CA from D6S276/D6S1691 (Table 5, offspring 97042, group 5, and 99016, group 7). Similar to the results obtained for CNPRC families, the crossover events identified in the BPRC families occurred primarily at the ends of the MHC region. No recombinants were observed in the region spanning DQ-DRA loci, except perhaps for the one questionable case for which DRA-CA was not informative.

Table 5 Recombinations observed in rhesus macaque breeding groups

Conclusions

In this report, we characterized the polymorphism of eight STRs within or near the MHC and mapped their location in the rhesus macaque genomic sequence. Comparison of allelic diversity and frequency in Indian- and Chinese-origin rhesus macaques provided additional evidence of population differentiation that has been documented in the literature for STRs not linked to the MHC, blood proteins, and MHC class II genes. The observed differences for MHC-linked STRs between different rhesus macaque populations most likely underlie biological variation in adaptive and innate immunity and further justify efforts for more detailed characterization of the MHC region in this species. We also showed that these highly polymorphic markers were useful to help define extended haplotypes across the MHC, to identify haplotypes that were identical by descent, and to differentiate chromosome configurations that would otherwise appear identical, or nearly so, on the basis of limited gene-specific typing.

Segregation analyses suggested variable recombination rates across the MHC region in a pattern similar to that of humans. Because large pedigrees can be obtained from breeding colonies, more-detailed studies of recombination in rhesus macaques are possible and could provide comparative data regarding the evolution of the MHC within and between related species. The apparent LD between class II genes and adjacent STRs spanning the DQ-DR region suggested that inclusion of these and additional markers surrounding this region may be useful to define haplotypic blocks and may help in mapping genes or chromosomal segments associated with disease.

Genetic testing is increasingly used to establish or validate pedigree records and to manage breeding colonies of captive animals in primate centers around the world. The incorporation of MHC-linked STR typing as part of this routine will enhance the genetic characterization of captive-bred rhesus macaques and will help in the production and careful selection of experimental animals, particularly with respect to such an important genomic region as the MHC. The complete MHC sequence for rhesus macaques will make it possible to identify a plethora of other markers, STRs, or single nucleotide polymorphisms, that will further contribute to understanding the role of different MHC regions in immune-related processes.