Background

Microsatellites are short segments of DNA in which a specific motif of 1–6 bases is repeated [1, 2]. Owing to their high polymorphism, codominant inheritance, ease of scoring and dense distribution throughout eukaryotic genomes, microsatellites are now generally considered to be the most powerful genetic markers for genetic mapping and evolutionary studies [3]. One perceived difficulty with microsatellites is the long lead time needed to identify and characterize them in new taxonomic groups. This problem has been alleviated by the development of novel protocols for enriching repeat DNA from genomic DNA [4]. However, most microsatellites are type II markers, for which no known function has been established. Type I markers are associated with genes of known function and are more useful for comparative gene mapping to study genome evolution [5] and for identifying markers associated with important quantitative traits [6]. Although SNPs in genes have been identified in some fish species [7, 8], type I markers are still relatively rare in fish. Detection of polymorphic microsatellites located within transcribed genes makes it possible to convert type II markers into type I markers [9]. Previous studies demonstrated that some microsatellites within genes were associated with economically important traits [6, 10] and could be used for marker-assisted selection. To date, microsatellites in transcribed genes have been identified in model organisms [11] and in economically important animal [8, 9, 12–14] and plant species [15, 16], either by mining ESTs with bioinformatics tools or by directly sequencing ESTs. However, for most of the 31,000 fish species on earth [17], it is difficult to obtain microsatellites in cDNA through data mining, because no ESTs are available or the number of ESTs is limited. Although a method for enriching microsatellites from genomic DNA has been adapted to identify microsatellites from cDNA in catfish [18], the efficiency of isolating microsatellites from cDNA is still not very high compared with that from genomic DNA [19], owing to the redundancy of cDNA. In this paper, we report a very simple and efficient method for isolating microsatellites from transcribed genes. The method includes cDNA normalization, microsatellite enrichment and directional cloning of cDNA enriched with microsatellites.

Results and discussion

In a previous study [10], we sequenced 4800 ESTs from six normalized cDNA libraries of Asian seabass (Lates calcarifer). From the 4800 ESTs, a total of 70 unique sequences containing microsatellites (repeat length: dinucleotide > 7, trinucleotide > 6, tetranucleotide > 5) from 130 clones were identified. Among the 70 microsatellites, 42 were CA-repeats, 23 GA-repeats, two GGA-repeats and three other types of repeats. These data indicate that unique microsatellite sequences accounted for 1.45% (70/4800) of cDNA clones in Asian seabass. CA- and GA-microsatellites were most abundant in cDNA of Asian seabass. However, they represent only 0.83% (40/4800) and 0.48% (23/4800) of cDNA clones from normalized cDNA libraries. Hence, straightforward random sequencing of clones from normalized cDNA libraries is not efficient for discovering microsatellites.

In this study, we first tried to enrich CA- and GA-microsatellites from unnormalized cDNA of Asian seabass using biotinylated (CA)10 and (GA)10 oligonucleotides, since these two types of microsatellites are the most abundant in cDNA of Asian seabass. Two cDNA libraries were constructed, one enriched for CA-microsatellites and the other for GA-microsatellites. From each library, 192 randomly picked clones were sequenced in both directions. Among the 192 clones from the cDNA library enriched for CA-repeats, 80 clones contained microsatellites. Of these 80 clones, only 11 were singletons, and the remaining 69 fell into 8 clusters. A total of 19 (9.9%) unique microsatellites were thus obtained from the 192 sequenced clones (Table 1). Similarly, among the 192 sequenced clones from the cDNA library enriched for GA-repeats, 40 clones contained microsatellites. Eight were singletons, and 32 fell into 6 clusters. The cDNA sequence of the parvalbumin beta-1 gene, which contains one CT-microsatellite [10], appeared 8 times in the 192 clones. A total of 14 (7.3%) unique microsatellites were obtained from the cDNA library enriched for GA-microsatellites. Compared with random sequencing of clones from normalized cDNA libraries without microsatellite enrichment, the efficiency of microsatellite isolation from unnormalized cDNA libraries enriched for microsatellites was more than 10 times higher (for CA-microsatellites: 9.9% vs. 0.83%; for GA-microsatellites: 7.3% vs. 0.48%). A similar efficiency of microsatellite isolation from cDNA was reported in catfish [18]. However, the high redundancy of cDNA sequences from unnormalized cDNA libraries reduced the efficiency of microsatellite isolation from cDNA.

Table 1 Comparison of the efficiency of microsatellite enrichment from unnormalized and normalized cDNA libraries of Asian seabass

To increase the efficiency of microsatellite enrichment, we tried to reduce the redundancy of cDNA by normalizing it with duplex-specific nuclease (DSN) [20] before enrichment of CA- and GA-microsatellites (Figure 1). After cDNA normalization, redundant cDNAs were removed (Figure 2). Two normalized cDNA libraries, one enriched for CA-microsatellites and the other for GA-microsatellites, were created. From each library, 192 clones were sequenced from both ends. Eighty-eight (45.8%) and 51 (26.5%) of the 192 clones from the normalized cDNA libraries enriched for CA- and GA-repeats, respectively, contained microsatellites (Table 1). The redundancy of clones was substantially reduced. Of the 88 microsatellite-containing clones from the cDNA library enriched for CA-microsatellites, 41 were singletons and the remaining 47 fell into 10 contigs. A total of 51 (26.5%) unique microsatellites were obtained from 192 sequenced clones. Of the 51 microsatellite-containing clones from the cDNA library enriched for GA-repeats, 35 were singletons and 16 fell into 5 clusters. A total of 40 (20.8%) unique microsatellites were obtained from 192 sequenced clones (Table 1). Compared with enrichment from unnormalized cDNA, the efficiency increased about threefold when normalized cDNA was used (for CA enrichment: 26.5% vs. 9.9%; for GA enrichment: 20.8% vs. 7.3%). Therefore, decreasing the prevalence of clones representing abundant transcripts by normalizing cDNA before microsatellite enrichment is essential for efficient microsatellite isolation from cDNA. The normalization of cDNA using DSN was very simple and highly efficient in comparison to other cDNA normalization methods [21]. The whole procedure of microsatellite enrichment, starting from normalization of cDNA, took only 5 days. Application of this method to isolate microsatellites from cDNA of grass carp brain gave similar results (data not shown). Therefore, the method appears to be robust and reproducible.
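The yield figures quoted above follow from simple arithmetic: each singleton and each cluster (or contig) counts as one unique microsatellite, and the yield is that count divided by the 192 sequenced clones. The short Python sketch below recomputes these values from the clone counts given in the text. The class and field names are illustrative and not part of the original analysis. Note that 51/192 is 26.56%, which the text reports as 26.5%.

```python
from dataclasses import dataclass


@dataclass
class Library:
    """Clone counts for one enriched cDNA library (values taken from the text)."""
    name: str
    sequenced: int    # clones sequenced
    with_ssr: int     # clones containing a microsatellite
    singletons: int   # microsatellite clones seen once
    clusters: int     # groups of redundant microsatellite clones

    def unique(self) -> int:
        # Each singleton and each cluster represents one unique microsatellite.
        return self.singletons + self.clusters

    def unique_pct(self) -> float:
        return 100.0 * self.unique() / self.sequenced

    def redundant_fraction(self) -> float:
        # Share of microsatellite-containing clones that are repeat copies.
        return 1.0 - self.unique() / self.with_ssr


LIBRARIES = [
    Library("unnormalized, CA-enriched", 192, 80, 11, 8),
    Library("unnormalized, GA-enriched", 192, 40, 8, 6),
    Library("normalized, CA-enriched", 192, 88, 41, 10),
    Library("normalized, GA-enriched", 192, 51, 35, 5),
]

if __name__ == "__main__":
    for lib in LIBRARIES:
        print(f"{lib.name}: {lib.unique()} unique "
              f"({lib.unique_pct():.2f}% of clones, "
              f"{lib.redundant_fraction():.0%} of hits redundant)")
```

The redundant fraction makes the effect of normalization explicit: it is the share of microsatellite-containing clones that merely repeat a sequence already seen.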

Figure 1

Schematic presentation of the method for microsatellite enrichment from normalized cDNA. Details of each step can be found in the section "Methods".

Figure 2

Agarose gel electrophoresis (1%) of smart cDNA, normalized cDNA and cDNA enriched with microsatellites. Lane 1: 1 Kb ladder (NEB); Lane 2: smart cDNA; Lane 3: normalized cDNA; lane 4: 100 bp ladder (NEB); lane 5: normalized cDNA enriched for CA-microsatellites and lane 6: normalized cDNA enriched for GA-microsatellites.

Sixty of the 91 microsatellites isolated from the libraries enriched for CA- and GA-microsatellites had flanking regions long enough for primer design, and were characterized in a panel of 24 individuals previously used for characterization of microsatellites isolated from genomic DNA [22]. Forty-one were polymorphic, with an average allele number of 4.85 ± 0.54 (range 2 to 20) (Table 2); the average expected and observed heterozygosities were 0.56 ± 0.03 and 0.47 ± 0.04, respectively. The average allele number of microsatellites isolated from cDNA is slightly lower than that of microsatellites isolated from genomic DNA libraries and characterized with the same DNA panel [22], which might be due to the relatively lower number of repeats in microsatellites identified from cDNA. Examination of genotyping errors using MicroChecker revealed no evidence for large-allele dropout or stutter-band scoring at any of the 41 loci. All 41 polymorphic microsatellites showed a Mendelian pattern of inheritance. Twenty-nine of the 41 microsatellites were in Hardy-Weinberg equilibrium (HWE) (Table 2). Departure from HWE at the other 12 loci may be caused by the presence of null alleles. However, examination of genotypes using MicroChecker indicated that the probability of null alleles being present is low (P > 0.05). Therefore, microsatellites isolated from cDNA using the described method could be useful for linkage mapping, comparative mapping and studies on genome evolution.
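For readers unfamiliar with the per-locus statistics summarized above, the following Python sketch shows how allele number, observed heterozygosity and expected heterozygosity can be computed from diploid genotypes. It is a minimal illustration and not the GDA implementation used in this study. The formula He = 1 − Σp², the optional small-sample correction, and the example genotypes (allele sizes in bp) are assumptions made for the example only.

```python
from collections import Counter


def locus_summary(genotypes):
    """Summarize one locus from a list of (allele_a, allele_b) tuples."""
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n  # number of gene copies sampled

    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under HWE: one minus the sum of squared allele frequencies.
    he = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    # Small-sample correction (an assumption; the estimator used by GDA is not restated here).
    he_unbiased = he * total / (total - 1)

    return {"alleles": len(allele_counts), "Ho": ho, "He": he,
            "He_unbiased": he_unbiased}


if __name__ == "__main__":
    # Hypothetical genotypes for four individuals, for illustration only.
    example = [(150, 150), (150, 154), (154, 154), (150, 154)]
    print(locus_summary(example))
```

A locus where Ho falls well below He, as in the averages reported above (0.47 vs. 0.56), shows a heterozygote deficit, which is why null alleles were considered as a possible cause.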

Table 2 Characterization of 41 microsatellites isolated from cDNA of Asian seabass

Conclusion

We have developed a very simple and highly efficient method for identifying microsatellites from cDNA. Microsatellites isolated from cDNA showed polymorphism and a Mendelian pattern of inheritance. Therefore, the method will be ideal for isolation of microsatellites from cDNA of fish species where there are no EST sequences available or the number of ESTs is limited.

Methods

Identification of microsatellites from existing ESTs of Asian seabass (Lates calcarifer)

In a previous study, we sequenced 4800 ESTs from six normalized cDNA libraries of Asian seabass [10]. Microsatellites in these ESTs were identified using SciRoKo 3.1 [23] with default search parameters. SciRoKo also provides statistical analysis of the microsatellites.
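To make the repeat-length criteria given in the Results concrete (dinucleotide > 7, trinucleotide > 6, tetranucleotide > 5 repeats), the Python sketch below scans a sequence for perfect repeats that meet them. This is only an illustration of the thresholds. It is not SciRoKo's algorithm, it ignores imperfect repeats, and the function names and example sequence are hypothetical.

```python
import re

# Minimum repeat counts implied by "> 7", "> 6" and "> 5" repeats.
MIN_REPEATS = {2: 8, 3: 7, 4: 6}


def is_reducible(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'CACA', 'AA')."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_microsatellites(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A motif of length k followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_reducible(motif):
                continue  # skip homopolymers and shorter motifs in disguise
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    est = "TTG" + "CA" * 9 + "GGT"  # hypothetical EST fragment
    print(find_microsatellites(est))
```

Because the regular-expression scan is leftmost and non-overlapping, the reported motif depends on where the repeat run starts (for example, "CA" versus "AC"); a production tool would normalize motifs to a canonical form.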

Isolating microsatellites from cDNA

Synthesis of first strand and second strand cDNA

Total RNA was isolated from the brain of a 3-month-old Asian seabass using Trizol (Invitrogen) according to the manufacturer's protocol. Residual DNA in the RNA was removed by treatment with DNase (NEB). First strand cDNA was synthesized using CDS primer [AAGCAGTGTATCAACGCAGAGTA(T35)], Smart oligo II A primer (AAGCAGTGTATCAACGCAGAGTACrGrGrG) and PowerScript reverse transcriptase (BD Bioscience) according to BD Bioscience's protocol. The 10 μl reaction consisted of 3 μg total RNA, 1 mM CDS primer, 1 mM smart oligo II A primer, 1 × first strand buffer (BD Bioscience), 2 mM DTT and 1 mM of each dNTP. The reaction was incubated at 42°C for 2 hours on a PTC-100 PCR machine (MJ Research) and then cooled on ice.

Second strand cDNA was synthesized using SMART cDNA technology. Briefly, the synthesized first strand cDNA was diluted 1:6 in 1 × TE (pH 8.0), heated at 72°C for 10 min, and used in the synthesis of the second strand cDNA according to BD Bioscience's protocol. The 50 μl reaction comprised 1 μl diluted first strand cDNA, 1 × buffer (BD Bioscience), five units of polymerase mix (BD Bioscience), 200 μM dNTPs and 0.25 μM smart PCR primer (AAGCAGTGTATCAACGCAGAGT). PCR was performed on a PTC-100 PCR machine using the following program: 20 cycles of 95°C for 8 s, 68°C for 20 s and 72°C for 3 min.

Normalization of smart cDNA

Normalization of cDNA was conducted using DSN (duplex-specific nuclease) (Evrogen) [20] according to the manufacturer's recommendation. Briefly, the amplified second strand cDNA was cleaned using glassmilk (Gen 101) and diluted to 50 ng/μl. Three microliters of cDNA with 1 μl 4 × hybridization buffer [200 mM Hepes-HCl (pH 8.0), 2 M NaCl] were denatured at 95°C for 5 min, and then incubated at 68°C for 4 h for renaturation. After the incubation, the following reagents, preheated at 68°C, were added to the hybridization reaction: 3 μl water, 1 μl 5 × DSN buffer [500 mM Tris-HCl (pH 8.0), 50 mM MgCl2 and 10 mM DTT] and 0.5 μl (1 U/μl) DSN (Evrogen). The 10 μl reaction was incubated at 65°C for 30 min on a PTC-100 PCR machine, followed by heating at 95°C for 8 min to inactivate the DSN. The normalized cDNA was diluted fourfold with water and amplified with the Smart PCR primer for 20 cycles as described in the section above.

Incorporating Sal I and Not I linkers to the ends of cDNA

To produce directional microsatellite-enriched cDNA libraries, linkers carrying a SalI and a NotI restriction site were added to the 5' and 3' ends of the normalized cDNA, respectively, using the following 50 μl PCR reaction: 1 μl (20 times diluted) smart cDNA or normalized smart cDNA, 1 × advantage 2 buffer (BD Bioscience), 200 μM dNTPs, 1 μl (5 units) advantage 2 polymerase mix (BD Bioscience), 0.15 μM NotI-T25 primer [AATGTCGAGVGGCCGCGTAC(T)25], 0.15 μM SalI primer (TTGTAGCGTCGACTCACTATC) and 0.015 μM SalI smart primer (TTGTAGCGTCGACTCACTATCAAGCAGTGTATCAACGCAGA). The PCR was performed on a PTC-100 PCR machine with the following program: 20 cycles of 95°C for 8 s, 68°C for 20 s and 72°C for 3 min.
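The linker design can be checked in silico. The Python sketch below confirms that the primers listed above carry the SalI (GTCGAC) and NotI (GCGGCCGC) recognition sequences and that the SalI smart primer is the SalI primer fused to the start of the smart PCR primer. Primer sequences are copied as printed. The "V" in the NotI-T25 primer is interpreted here as the IUPAC code for A, C or G, which is an assumption, since the text does not state whether a degenerate base was intended. The helper name `site_positions` is hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each one can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

SALI_SITE = "GTCGAC"
NOTI_SITE = "GCGGCCGC"

# Primer sequences as printed in the text (5' to 3'); the (T)25 tail is omitted.
SALI_PRIMER = "TTGTAGCGTCGACTCACTATC"
NOTI_PRIMER_HEAD = "AATGTCGAGVGGCCGCGTAC"
SALI_SMART_PRIMER = "TTGTAGCGTCGACTCACTATCAAGCAGTGTATCAACGCAGA"
SMART_PCR_PRIMER = "AAGCAGTGTATCAACGCAGAGT"


def site_positions(primer, site):
    """0-based positions where `site` is compatible with `primer`.

    A degenerate base in the primer is treated as compatible if it can
    represent the corresponding base of the recognition site.
    """
    width = len(site)
    return [
        i for i in range(len(primer) - width + 1)
        if all(base in IUPAC[code] for code, base in zip(primer[i:i + width], site))
    ]


if __name__ == "__main__":
    print("SalI site in SalI primer:", site_positions(SALI_PRIMER, SALI_SITE))
    print("NotI site in NotI-T25 primer:", site_positions(NOTI_PRIMER_HEAD, NOTI_SITE))
    tail = SALI_SMART_PRIMER[len(SALI_PRIMER):]
    print("SalI smart primer tail matches smart PCR primer:",
          SMART_PCR_PRIMER.startswith(tail))
```

Checking that neither primer contains the other enzyme's site is a useful companion test, because a stray site would undermine directional cloning.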

Enrichment of microsatellites

Microsatellites in cDNA were enriched by using biotinylated oligonucleotides and streptavidin-coated magnetic beads. Briefly, 1 μg cDNA in 6 × SSC was denatured at 98°C for 5 min, followed by hybridization with 1 μl of 10 pmol/μl biotinylated (CA)10 or (GA)10 in 65 μl 6 × SSC at 55°C for 25 min. The hybridization products (65 μl) were captured with 35 μl (ca. 350 μg) streptavidin-coated beads (Pierce) suspended in 6 × SSC, which had been washed twice in 1 × TE (pH 8.0) and twice in 6 × SSC before capture at room temperature. Beads carrying microsatellite-enriched cDNA were washed twice in 2 × SSC containing 0.1% SDS and twice in 1 × SSC at room temperature, followed by a final wash in 1 × SSC at 55°C for 5 min. The captured cDNA was eluted with 30 μl water and PCR-amplified in a 25 μl reaction consisting of 3 μl eluted cDNA, 200 nM SalI primer, 200 nM NotI-T25 primer, 200 μM dNTPs, 1 × PCR buffer and two units of polymerase mix (BD Bioscience). The PCR was carried out on a PTC-100 PCR machine using the following program: 30 cycles of 95°C for 8 s, 65°C for 20 s and 72°C for 3 min. PCR products were cleaned and concentrated using glassmilk (Gen 101).

Directional cloning of microsatellite-enriched cDNA

The microsatellite-enriched cDNA PCR products were digested with SalI and NotI using the following protocol: 20 μl (ca. 500 ng) cleaned cDNA, 3 μl 10 × SalI buffer, 1 μl (20 units) SalI (NEB) and 1 μl (10 units) NotI (NEB) at 37°C for 2 hours. After digestion, the cDNA was electrophoresed on a 1% low-melt agarose gel (BIO-RAD). Fragments between 500 bp and 1200 bp were excised and cleaned using glassmilk (Gen 101). Approximately 50 ng cDNA was ligated to 25 ng pCMV-SPORT-6 vector (Invitrogen), and the ligation product was used to transform XL-blue supercompetent cells (Stratagene). A schematic presentation of the method for microsatellite enrichment from normalized cDNA is shown in Figure 1.

Sequencing of clones

White colonies were picked and arrayed into 96-well plates containing 40 μl LB liquid medium with 100 μg/ml ampicillin in each well. The 96-well plates were cultured at 37°C for 16–18 hours without shaking. The insert of each colony was PCR-amplified using two microliters of cell culture in LB as template and a low concentration (50 nM) of M13PUC forward (5' CCCAGTCACGACGTTGTAAAACG 3') and reverse (5' AGCGGATAACAATTTCACACAGG 3') primers with the following PCR program: 94°C for 5 min followed by 35 cycles of 94°C for 30 s, 55°C for 30 s and 72°C for 48 s, with a final extension at 72°C for 5 min. Two microliters of colony PCR products were directly sequenced in both directions using M13PUC-F/M13PUC-R primers and BigDye kit on an ABI3730xl sequencer (both from Applied Biosystems). Forward and reverse sequences were assembled using the software Sequencher (GeneCodes). Primers were designed in the flanking regions of a subset of microsatellites using PrimerSelect (Dnastar). One primer of each pair was labeled with the fluorescent dye Hex or Fam (1st Base). DNA sequences of the polymorphic microsatellites were deposited in GenBank under the accession numbers EF210110–EF210125 and FJ535708–FJ535732.

Characterization of microsatellites isolated from normalized cDNA

PCR amplification of microsatellites was performed on a PTC-100 thermal cycler in a 25 μl reaction volume containing 100 ng DNA, 1 × PCR buffer [50 mM KCl, 10 mM Tris-HCl (pH 8.8), 1.5 mM MgCl2 and 0.1% Triton-X 100], 200 nM of each primer, 50 μM of each dNTP and one unit of DNA polymerase (Finnzymes). Cycling conditions were: 94°C for 2 min followed by 35 cycles of 94°C for 30 s, 55°C for 30 s and 72°C for 30 s, with a final extension at 72°C for 5 min. Fluorescence-based genotyping of 24 unrelated Asian seabass individuals originating from Southeast Asia and Australia was conducted using an automated DNA sequencer ABI 3730xl (Applied Biosystems). Each microsatellite was examined for genotyping errors using MicroChecker [24]. Mendelian inheritance patterns of all microsatellites were examined using the chi-square test on one of three pedigrees, each including one parental pair and 24 offspring. Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium were examined using GDA [25].
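As an illustration of the chi-square test of Mendelian inheritance mentioned above, the Python sketch below derives the expected offspring genotype classes from the parental genotypes of a single codominant locus and compares them with the observed counts. It is a minimal sketch under stated assumptions and not the exact procedure used in this study. The parental and offspring genotypes are hypothetical, and the critical values at P = 0.05 are hard-coded for 1 to 3 degrees of freedom.

```python
from collections import Counter
from itertools import product

# Chi-square critical values at P = 0.05, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def expected_ratios(sire, dam):
    """Expected offspring genotype frequencies from a Punnett square."""
    combos = Counter(tuple(sorted(pair)) for pair in product(sire, dam))
    return {genotype: count / 4 for genotype, count in combos.items()}


def mendelian_chi2(sire, dam, offspring):
    """Return (chi2, degrees_of_freedom) for one locus in one pedigree."""
    expected = expected_ratios(sire, dam)
    observed = Counter(tuple(sorted(g)) for g in offspring)
    impossible = set(observed) - set(expected)
    if impossible:
        # Genotypes the parents cannot produce point to mistyping or null alleles.
        raise ValueError(f"non-parental genotypes observed: {sorted(impossible)}")
    n = len(offspring)
    chi2 = sum((observed.get(g, 0) - p * n) ** 2 / (p * n)
               for g, p in expected.items())
    return chi2, len(expected) - 1


def fits_mendelian(sire, dam, offspring):
    chi2, df = mendelian_chi2(sire, dam, offspring)
    # With a single expected class (df = 0) there is nothing to test.
    return df == 0 or chi2 < CHI2_CRIT_05[df]


if __name__ == "__main__":
    # Hypothetical pedigree: both parents heterozygous, 24 offspring.
    sire, dam = (150, 154), (150, 154)
    offspring = [(150, 150)] * 5 + [(150, 154)] * 14 + [(154, 154)] * 5
    print(mendelian_chi2(sire, dam, offspring), fits_mendelian(sire, dam, offspring))
```

With 24 offspring and up to four genotype classes, expected counts can be as low as six per class, so the chi-square approximation is serviceable but not generous; an exact test would be a reasonable alternative.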