Background

The mammalian Major Histocompatibility Complex (MHC) harbors genes involved in the overall resistance or susceptibility of animals to infectious pathogens, including viruses, bacteria, and internal and external parasites. Pathogens exert selection pressure on their hosts, and the hosts are forced to develop effective strategies to fight the pathogens in various environments. Such co-evolutionary struggles may have left distinct marks in the genome of each species involved, and mammalian MHC regions have been shaped into clusters of immunological gene families by such host-pathogen interactions, probably via functional gene duplications [1–3]. The role of ovine MHC molecules in providing protection against pathogens [4–8] and the structure of the artiodactyl MHC region in general have led to a number of studies of the sheep MHC [9–15].

The ovine MHC, also called ovine leukocyte antigen (OLA), is located on the long arm of ovine chromosome 20 (OAR 20q15–20q23), with a structure and organization similar to those of human and other mammals [16]. The literature shows that MHC genes play vital roles in the resistance of animals to foot rot [17], parasites [9], and bovine leukemia virus [7]. To date, the majority of studies on the structure and organization of the ovine MHC have focused on the gene content and polymorphism of the class II region [18–23]. Although most loci in the sheep MHC are homologous to their counterparts in the human MHC [12, 21, 24, 25], there are significant differences. For example, the DP loci in human are replaced by DY in sheep [19, 21, 26, 27], and the number of DQA loci varies significantly among sheep breeds [20, 22, 28].

Compared to human and mouse, the sheep MHC is interrupted by an autosomal insertion of ~14 Mb in the class II region, possibly the result of a hypothetical chromosome inversion (inversion/insertion), similar to that of cattle [24, 29–32]. The inversion/insertion constitutes ~25% of ovine chromosome 20 and splits the MHC class II region into IIa and IIb. The significance of this insertion for ovine MHC function remains unknown. The evolutionary consequences of such an event also deserve attention, because some of the ovine-specific MHC loci, such as DY and Dsb, are located near the boundary of the inversion/insertion. We previously constructed a physical map of BAC clone contigs covering the ovine MHC except for the autosome insertion region [12, 13], and a high-accuracy sequence map of sheep OLA was constructed accordingly [14].

With the initial release of the sheep whole-genome reference sequence by the International Sheep Genomic Consortium (ISGC), much more genome sequence information is now accessible for functional and comparative studies [33]. Nevertheless, the sequence map would serve the research community even better if its DNA sequence and assembly were cross-checked for accuracy by an alternative approach, at least for some chromosome regions. In this regard, detailed information on gene structure, organization, and DNA sequence is still not fully available for the ovine chromosome region between OLA class IIa and IIb [12, 14, 27].

In this paper, we describe the construction of a BAC physical map covering the entire autosome insertion between ovine MHC class IIa and IIb. Because the ovine and bovine species share a consensus structure and organization across the entire MHC region [24, 29–32], we used a comparative approach to screen a sheep BAC library with 119 bovine oligonucleotide primer pairs designed from the bovine genomic sequence of the consensus region. The order and overlapping relationships of the identified BAC clones were determined by DNA fingerprinting, BAC-end sequencing, and sequence-specific PCR. A total of 108 effective overlapping BAC clones were selected to fully cover the region between class IIa and IIb. The physical map we constructed will help to generate an ovine MHC sequence map with a high level of accuracy, which in turn will facilitate functional studies of the MHC and comparative studies of MHC evolution in ruminants.

Methods

Comparative design of oligo primers

A BAC library was previously constructed using genomic DNA from a male Chinese Merino sheep, with a total of 190,500 BAC clones and an average insert length of 133 kb [12, 13]. To screen the BAC library for positive clones in the target genome region between ovine MHC class IIa and IIb, we adopted a comparative strategy and designed bovine oligonucleotide primers from the bovine reference DNA sequence of the consensus genome region [34]. At the time this study was conducted, no sheep genomic sequence was publicly available for the region of interest. Bovine DNA sequences of homologous genes, exons, introns, or partial STS sequences were acquired from the NCBI website (http://www.ncbi.nlm.nih.gov/genome/sts/). Primers were designed along the bovine MHC region between class IIa and IIb, with neighboring loci spaced approximately 80–160 kb apart, using the software Primer Premier 5.0 (Premier Biosoft International, CA). A total of 119 bovine primer pairs were designed for screening the sheep genomic BAC library (Table 1).

Table 1 Comparative bovine primers used for identification of the positive ovine BAC clones in the genome region between MHC Class IIa and IIb

BAC library organization and screening

To facilitate large-scale PCR screening, all 190,500 clones of the BAC library were organized into 3-dimensional BAC clone pools of plates, rows, and columns. Random BAC clones from each of 496 permanent 384-well storage plates were duplicated onto a Luria-Bertani (LB) agar plate for overnight growth at 37°C, using a 384-pin Multi-Blot Replicator (V & P Scientific, Inc., San Diego, CA). The overnight E. coli colonies were then harvested and pooled by plate (n = 496), row (n = 16), or column (n = 24). The standard alkaline lysis method was used to isolate the pooled BAC plasmid DNA, and the resulting DNA was assembled into super plates for routine PCR screening [35]. The first dimension of the BAC clone pool consisted of 496 DNA samples, each representing one of the 496 BAC plates (P001-P496). The second and third dimensions consisted of 16 and 24 DNA samples, respectively, for the pooled 16 rows (R01-R16) and 24 columns (C01-C24) of the random BAC clones.

To screen the BAC library with each of the 119 comparative primer pairs, diluted DNA from each well of the super pool plates was used as the template. Each PCR was carried out in a total reaction volume of 10 μl containing 50 μM dNTPs, 1.5 mM Mg++, 0.2 μM of each primer, 1× PCR buffer, and 0.1 unit of Taq DNA polymerase. The PCR products were resolved by 1.5% agarose gel electrophoresis, and a specific band of the expected size indicated a potential positive BAC clone for the locus targeted by the primers. The exact location of the target clone in the BAC library was then determined by sequential PCRs using the super row and super column DNA as templates, respectively.
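The pooling scheme can be illustrated with a minimal Python sketch. It assumes that a screen returns exactly one positive plate pool, one positive row pool, and one positive column pool. The pool labels (P001-P496, R01-R16, C01-C24) come from the text and the row letters A to P from the Figure 1 legend, but the function names and the well-address format are hypothetical and are not part of the original protocol.

```python
from string import ascii_uppercase

# Pool dimensions reported in the text.
N_PLATES, N_ROWS, N_COLS = 496, 16, 24


def pool_labels(plate: int, row: int, col: int) -> tuple[str, str, str]:
    """Return the plate, row, and column pool labels for a 1-based clone position."""
    if not (1 <= plate <= N_PLATES and 1 <= row <= N_ROWS and 1 <= col <= N_COLS):
        raise ValueError("position lies outside the library layout")
    return f"P{plate:03d}", f"R{row:02d}", f"C{col:02d}"


def clone_address(plate_hit: str, row_hit: str, col_hit: str) -> str:
    """Combine three positive pools into a well address (hypothetical format)."""
    plate, row, col = int(plate_hit[1:]), int(row_hit[1:]), int(col_hit[1:])
    pool_labels(plate, row, col)  # reuse the range check
    # Rows are lettered A to P, columns numbered 01 to 24.
    return f"P{plate:03d}-{ascii_uppercase[row - 1]}{col:02d}"


if __name__ == "__main__":
    print(pool_labels(98, 14, 7))
    print(clone_address("P098", "R14", "C07"))
```

With several positive rows or columns on one plate, the combination is no longer unique and each candidate well would need an individual confirmation PCR.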

DNA fingerprinting and contig assembling

DNA fingerprinting was performed to determine the overlapping relationships among the identified positive BAC clones [12]. DNA from each positive BAC clone was purified from host E. coli with a QIAGEN column and digested to completion with the restriction enzyme HindIII. The digestion products were separated by electrophoresis on a 1% agarose gel in TAE buffer to record the DNA fragment patterns. The fingerprinting images were captured with a UVP Labworks System (UVP Inc., Upland, CA) for systematic analysis. Restriction fragment patterns were analyzed to identify overlapping BAC clones, which were then manually assembled into draft contigs based on modified versions of the methods of Marra [36] and Soderlund [37].
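The idea of inferring overlap from shared restriction fragments can be sketched as follows. This is a simplified illustration, not the scoring used by Marra [36] or Soderlund [37]: it counts fragments that match one-to-one within a relative size tolerance. The 2% tolerance, the function names, and the fragment sizes in the example are all assumptions made for illustration.

```python
def shared_bands(frags_a, frags_b, tol=0.02):
    """Count fragments matched one-to-one within a relative size tolerance.

    frags_a, frags_b: fragment sizes in bp for two clones.
    tol: assumed relative tolerance for calling two bands identical.
    """
    a, b = sorted(frags_a), sorted(frags_b)
    i = j = shared = 0
    # Walk both sorted lists in parallel, pairing bands of similar size.
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol * max(a[i], b[j]):
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def overlap_fraction(frags_a, frags_b, tol=0.02):
    """Shared bands divided by the band count of the clone with fewer bands."""
    smaller = min(len(frags_a), len(frags_b))
    return shared_bands(frags_a, frags_b, tol) / smaller if smaller else 0.0


if __name__ == "__main__":
    # Hypothetical HindIII fragment sizes (bp) for two candidate neighbors.
    clone_x = [12000, 8500, 6100, 4300, 2900, 1500]
    clone_y = [8450, 6150, 4300, 3600, 2100, 900]
    print(shared_bands(clone_x, clone_y), overlap_fraction(clone_x, clone_y))
```

A high shared fraction suggests, but does not prove, physical overlap, which is why the clone order was also checked by BAC-end sequencing and PCR.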

BAC-end sequencing

BAC-end sequencing was performed for the selected clones to verify the overlapping relationships of the BAC clones. The sequencing was performed on an ABI 3730X DNA analyzer at the core facilities of the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences. The oligonucleotide primers used for DNA sequencing were the CopyControl pCC1BAC vector-derived sequencing primers T7 (5'-TAATACGACTCACTATAGGG-3'), pCC1/pEpiFOS RP-2 (abbr. RP-2) (5'-TACGCCAAGCTATTTAGGTGAGA-3'), and pCC1/pEpiFOS RP-1 (abbr. RP-1) (5'-CTCGTATGTTGTGTGGAATTGTGAGC-3'). The resulting sequences were analyzed for overlaps and used as templates for primer design. Based on the BAC-end sequence data, PCR primers (Additional file 1: Table S1) were designed to amplify loci common to two overlapping BACs for confirmation. Sequence-specific PCRs (SP-PCRs) were performed in a 20 μl reaction containing approximately 2 ng BAC DNA, 0.5 U Taq DNA polymerase, 0.1 mM dNTPs, 1.5 mM Mg++, 0.25 μM of each primer, and 1× PCR buffer. When necessary, the PCR products were cloned into a TA vector and sequenced for verification.
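As a small illustration of the routine checks applied to primers such as those listed above, the following Python sketch computes the length, GC fraction, a rough melting temperature, and the reverse complement of an oligonucleotide. The three sequences are copied from the text. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a common rule of thumb for short oligos, used here as an assumption; it is not stated to be the method the authors used.

```python
# Vector-derived sequencing primers as given in the text.
VECTOR_PRIMERS = {
    "T7": "TAATACGACTCACTATAGGG",
    "RP-2": "TACGCCAAGCTATTTAGGTGAGA",
    "RP-1": "CTCGTATGTTGTGTGGAATTGTGAGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (assumed here)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence made of A, C, G, and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in VECTOR_PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```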

Assembly of the BAC clone contig

A continuous BAC clone contig was eventually assembled based on the integrated results of DNA fingerprinting, BAC-end sequencing, and sequence-specific PCR amplification of loci common to overlapping clones. Redundant BAC clones were removed from the assembly according to the necessity and relative contribution of each overlapping BAC to the contig. Gaps in the contig were closed by repeated cycles of PCR screening of BAC clones, DNA fingerprinting of the additional clones identified, BAC-end sequencing, and SP-PCR verification. Additional effort was made to link the BAC clone contig to the physical map constructed previously, yielding a complete physical map covering the entire ovine MHC, including the autosome insertion between class IIa and IIb.
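Removing redundant clones while keeping a gap-free contig resembles the classic minimal tiling path problem. The sketch below is a generic greedy interval cover, offered only as an illustration under the assumption that approximate clone coordinates are already known. The clone names and coordinates are hypothetical, and the authors' selection was based on the necessity and relative contribution of each clone, not necessarily on this algorithm.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy minimum set of clones covering [region_start, region_end].

    clones: iterable of (name, start, end) tuples in any consistent unit.
    Returns the selected clone names in map order.
    Raises ValueError if the clones leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    covered = region_start
    i = 0
    while covered < region_end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in coverage at position {covered}")
        path.append(best[0])
        covered = best[2]
    return path


if __name__ == "__main__":
    # Hypothetical clone coordinates in kb.
    clones = [("A", 0, 130), ("B", 40, 170), ("C", 100, 240),
              ("D", 150, 260), ("E", 230, 370), ("F", 250, 380)]
    print(minimal_tiling_path(clones, 0, 380))
```

In practice one would also require a minimum overlap between neighboring clones so that each junction can be verified by fingerprinting or PCR, which is why a real tiling path usually keeps more clones than the strict minimum.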

To compare the MHC structure and organization of sheep with those of other mammals, multiple comparisons were performed on representative MHC and extended DNA sequences from human, chimpanzee, mouse, cattle, and sheep. Sequence data were downloaded from the NCBI database and other public websites dedicated to sheep genomic information.

Results

Target BAC identification

We identified a total of 368 positive BAC clones for the region of ovine chromosome 20 between MHC class IIa and IIb, using bovine primers designed from the consensus genome region (Table 1). Out of the 119 primer pairs designed, 92 generated specific target gene fragments of the expected sizes, so the overall efficiency of comparative PCR was close to 80% (92/119, or about 77%). The relatively high success rate of the comparative SP-PCR not only facilitated our mapping efforts, but also helped to confirm the homologous nature of the MHC regions of the bovine and ovine species.

Organizing the ~190,500 random ovine BAC clones into a three-dimensional super DNA pool of rows (n = 16), columns (n = 24), and plates (n = 496) significantly increased the efficiency of PCR screening of the sheep BAC library (Figure 1). The whole BAC library of 8.4× genome equivalents could be screened with a maximum of 536 (=496 + 16 + 24) PCR reactions, and a positive BAC clone could frequently be identified with as few as 136 (=96 + 16 + 24) PCR reactions using the super pool DNA as templates. In addition, PCR-based BAC clone screening eliminated the need for hybridization-based screening using radioactive 32P labeling.
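The figures in this paragraph can be reproduced with a short calculation. The clone number and the 133 kb mean insert size come from the Methods. The sheep genome size of about 3.0 Gb is an assumption introduced here so that the 8.4× coverage can be checked, and the function names are illustrative only.

```python
N_CLONES = 190_500        # clones in the library (from the text)
MEAN_INSERT_KB = 133      # average insert length in kb (from the text)
ASSUMED_GENOME_GB = 3.0   # assumption: approximate sheep genome size in Gb


def library_coverage(n_clones: int, mean_insert_kb: float, genome_gb: float) -> float:
    """Genome equivalents represented by the library."""
    return n_clones * mean_insert_kb * 1e3 / (genome_gb * 1e9)


def pooled_reactions(plates: int, rows: int, cols: int) -> int:
    """Maximum PCR reactions to locate one clone using plate, row, and column pools."""
    return plates + rows + cols


if __name__ == "__main__":
    coverage = library_coverage(N_CLONES, MEAN_INSERT_KB, ASSUMED_GENOME_GB)
    print(f"library coverage: {coverage:.1f}x")
    print(f"pooled screening: {pooled_reactions(496, 16, 24)} reactions at most")
    print(f"clone-by-clone screening: {N_CLONES} reactions")
```

Under these assumptions, pooled screening needs several hundred reactions per primer pair at most, compared with one reaction per clone for an unpooled library.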

Figure 1

Representative gel images from the initial PCR screening of the ovine BAC library using comparative primers derived from bovine sequences. Approximately 190,500 random BAC clones were organized into pooled super DNA plates of rows, columns, and plates to facilitate PCR screening. The location of a target positive BAC clone in the library was usually determined by two runs of PCR, one for “plate” and the other for “row + column”. The procedure eliminated the need for hybridization-based screening with radioactive 32P labeling. Gel images of PCR screening bands for (A): Row pool of P098 BAC plate using the primer pair S036; (B): Row pool of P056 BAC plate using the primer pair S109; (C): Row N of P083 BAC plate; (D): Row F of P162 BAC plate. M: DL2000 marker. Sample: PCR products. A ~ P: row identifiers. 1 ~ 16: column numbers (only partially shown). P: positive control (PCR products amplified using sheep genomic DNA as the template).

DNA fingerprinting and BAC-end sequencing

The initial order of the positive BAC clones was determined by inferring the overlapping relationships among the clones via DNA fingerprinting, using HindIII for restriction digestion of the BAC clone DNA (Figure 2). Of the 368 positive BAC clones subjected to DNA fingerprinting, overlapping relationships were successfully determined for 185. The resulting BAC contig covered the entire autosome insertion region between MHC class IIa and IIb. After removing the redundant clones, a total of 108 effective BACs were ordered to form an overlapping BAC contig (Additional file 1: Table S1).

Figure 2

A representative image of DNA fingerprints of the positive BAC clones used to determine overlapping relationships. The positive BAC clones identified in the previous steps were digested with HindIII, followed by separation on a 1% agarose gel in 1× TAE buffer. The gel was stained with ethidium bromide (EB) and photographed with a UVP Labworks system. M: DNA size standard (1 kb plus DNA ladder from Invitrogen, San Diego, CA, USA) with the sizes in base pairs (bp) indicated on both sides.

To cross-check the clone order, BAC-end sequencing was performed for all overlapping BAC clones, and the sequences generated were used to design BAC-end primers (Additional file 1: Table S1) for further verification of the overlapping relationships. The sequences of 185 BAC ends have been deposited in the NCBI database under accession numbers HR309252 through HR309068, corresponding to dbGSS IDs 30164010 through 30163826.

Cross verification and physical map assembling

For additional cross-verification of the BAC clone order, a total of 108 pairs of BAC-end primers were designed to amplify by PCR the loci common to two overlapping BACs (Figure 3). An overlapping relationship between two BACs was considered verified if the common target locus was detected in both BACs in the overlapping region. Verification PCR confirmed the results of DNA fingerprinting with a high level of accuracy: of the 108 primer pairs used, 103 produced a specific PCR product of the expected size, for an overall success rate of 95% (Additional file 1: Table S1). The remaining five primer pairs either failed to generate a specific PCR band or produced a fragment of an unexpected size.

Figure 3

PCR verification of the overlapping relationship between pairs of overlapping BAC clones. Pairs of overlapped BAC clones were PCR amplified using a primer pair designed based on the BAC-end sequence. The markers above the black lines define the primer pairs and the ones below the lines are numbers of positive clones used as PCR templates.

A complete physical map of a BAC clone contig for the ovine MHC region between class IIa and IIb was assembled (Figure 4), based on the integrated results of DNA fingerprinting, BAC-end sequencing, and confirmation PCR of the BAC ends. The fully assembled physical map was composed of 108 effective ovine BAC clones organized into a continuous contig that covered the entire region between ovine MHC class IIa and IIb (Figure 4). Based on the results of DNA fingerprinting, no gaps exist in the constructed BAC clone physical map, which spans approximately 14 Mb of ovine chromosome 20. This suggests that BAC clones are evenly distributed in the library we previously constructed.

Figure 4

A 14 Mb BAC clone physical map covering the entire region between ovine MHC Class IIa and IIb. The order and orientation of the BAC clones (overlapping horizontal bars with the clone ID listed above) were determined by a combination of DNA fingerprinting, BAC-end sequencing, and sequence-specific PCR. Each target gene identified by BAC-end sequencing is marked with a vertical bar along the horizontal line, with the locus name listed above. The continuous BAC map is presented in three panels, with the overlapping regions marked by shading of the same color at both ends.

Discussion

Using a comparative approach, we constructed a 14 Mb BAC clone contig map for a region of ovine chromosome 20 that harbors the MHC. Comparison between the ovine BAC contig and the orthologous bovine genomic region showed that the two species share essentially the same genomic structure and organization across the entire inversion/insertion between MHC class IIa and IIb (Figure 5). For the genetic loci generated via SP-PCR and BAC-end sequencing, our results essentially confirmed the sheep genome sequence assembly presented by ISGC in the MHC region [33].

Figure 5

Schematic presentation of MHC structures among representative mammalian species. The bovine and ovine MHC is interrupted by a long non-MHC insertion that divides class II into IIa and IIb subregions. Red, blue, and green stand for MHC Class I, Class III, and Class II, respectively. The grey gradient represents the extended Class II region. The order of loci in the extended Class II region of bovine and ovine is in the opposite orientation compared to that of human, chimpanzee, and mouse. The dashed line marks the break point of a hypothetical chromosome inversion. Dashed circles indicate the hypothetical chromosome looping and subsequent crossover that occurred during the evolution of ruminants. The drawing is not to scale.

The physical map of the ovine BAC contig provides additional evidence for the hypothesis that an ancient chromosome rearrangement in the ancestor of ruminants shaped the MHC structure currently observed in the ovine and bovine species (Figure 5). The MHC region in human, mouse, and chimpanzee is continuous, whereas in bovine and ovine it is interrupted by a large autosome insertion that divides MHC class II into IIa and IIb subregions (Figure 5). Given the opposite locus order and orientation of the insertion region in ovine and bovine relative to human and mouse, it is highly likely that a recombination event occurred in the ancestral ruminant chromosome, probably via chromosome looping and subsequent crossover. This possibility has been suggested previously [29, 38].

Examination of the bovine DNA sequence in the public database showed that the total length of the bovine MHC is ~20 Mb, including the extended Class IIb region [34]. In contrast, the total length of the orthologous ovine MHC determined in this study was ~14.3 Mb, approximately 5.7 Mb shorter than the bovine MHC. On the other hand, the sequence of the same bovine region presented in the NCBI database is ~18 Mb in length (http://www.ncbi.nlm.nih.gov/projects/mapview/maps.cgi?taxid=9913&chr=23). These discrepancies are unlikely to be resolved until highly accurate sequence maps of the entire MHC regions become available.

In theory, the reliability of the ovine BAC contig map reported here is high, partly because DNA fingerprinting was used to infer the BAC clone order and the results were cross-verified by both BAC-end sequencing and SP-PCR amplification of the target loci. However, it has not escaped our attention that at 5 of the 108 overlapping locations in the BAC map, the SP-PCR failed to generate the expected PCR products between the overlapping BAC clones (data not shown). The significance of these failures for the overall quality of the map remains to be determined. Possible explanations include errors in the SP-PCR primer sequences, a high level of heterogeneity or polymorphism at the target loci involved, or mistakes in the interpretation of the DNA fingerprinting results.

By combining this map with our previous BAC physical map of the ovine MHC, we have now assembled a complete BAC clone physical map that includes the inversion/insertion region (Additional file 2: Figure S1). The physical map will help to generate an ovine MHC sequence map with a high level of accuracy, which in turn will facilitate functional studies of the MHC and comparative studies of MHC evolution in ruminants. DNA sequencing of the BACs is currently underway.

Conclusion

We constructed a high-density physical map of the sheep genome region between MHC class IIa and IIb via a comparative approach. A total of 108 effective ovine BAC clones were selected to form a continuous BAC contig that covers the entire non-MHC insertion. The map spans approximately 14 Mb, constituting ~25% of ovine chromosome 20. The entire ovine MHC region, including the autosome insertion for which the physical map has been constructed, is now fully covered by a continuous BAC clone contig. The accuracy of DNA sequences plays a vital role in detailed SNP and other functional studies of MHC genes, as well as in genome evolution studies. The physical map will help to generate an ovine MHC sequence map with a high level of accuracy, which in turn will facilitate functional studies of the MHC and comparative studies of MHC evolution in ruminants.