Background

MicroRNAs (miRNAs) are a class of non-coding small RNA molecules, generally 21 nucleotides (nt) in length that regulate critical functions in plant and animal development. miRNAs arise from an ~70-nt stem-loop RNA precursor that is cleaved and modified by Ribonuclease III to create an ~21-nt single-stranded miRNA [1, 2]. Generally, every mature miRNA derives from one of the precursor’s arms and all are located in intergenic regions, indicating that miRNA transcription is independent of other genes and has its own transcriptional regulatory mechanisms. miRNAs regulate gene expression at the post-transcriptional level through mRNA cleavage or translational inhibition in both animals and plants [37]. In plant cells, the miRNA sequence is almost completely complementary with target gene mRNA, through a mechanism similar to RNA interference (RNAi), leading to the degradation of target gene mRNA [8]. Recently, numerous miRNAs have been reported through cloning, high-throughput small RNA sequencing, and computational approaches based on sequence similarities and secondary structure predictions [4]. Because plant miRNAs are almost exactly complementary to their corresponding target sequences [5], target gene prediction is relatively straightforward and mainly through bioinformatics method.

In 1993, the first miRNA (lin-4 gene) was reported in Caenorhabditis elegance[9]. Then, Reinhart et al. [10] found the second metachronous switch gene, let-7, in C. elegans. Recently, with bioinformatics, molecular cloning technology improvements, and the establishment of model species cDNA libraries, hundreds of small RNA molecules have been successively reported in C. elegans[11, 12], Arabidopsis [13, 14], rice [15], maize [16], wheat [17], moss [18] and in cotton [1922]. Almost 3,600 miRNAs have been annotated in Magnoliophyta including 291 from Arabidopsis thaliana and 40 from Gossypium species, as of April 21, 2012 (http://www.mirbase.org/cgi-bin/browse.pl). Recently, with the sequencing of the G. raimondii (D5) genome, many miRNA precursors and miRNAs were identified [2325]. However, their distribution on the tetraploid cotton genome is unknown. To date, few miRNAs have been functionally characterized in cotton, and almost no predicted target genes have been verified by experiments in cotton.

Cotton is not only a fundamental world commodity that provides an important natural material, it is also an important oil crop. Since the first miRNA was reported, studies suggest that plant miRNAs negatively regulate target genes involved in plant development, organ morphogenesis, auxin signaling, and environmental stress responses [14, 2630]. MiRNAs may also play an important role in cotton fiber development and the study of miRNAs has far-reaching significance to improve fiber quality and yield [31].

At this time, most miRNAs and target gene research is focused on functional verification through gene over-expression, gene interference, and related methods, but there is paucity in development of miRNA markers and their relationship with phenotypes. Only one report in the literature described miRNA markers- miRNA-AFLP (amplified fragment length polymorphism) [32], but no miRNAs or their target genes have been genetically mapped in cotton.

To address this gap and to map miRNAs to understand their genomic distribution in cotton, we downloaded pre-miRNA sequences reported in the literature and within the miRBase database (Additional file 1: Table S1). Putative target genes were predicted by psRNATarget (http://plantgrn.noble.org/psRNATarget). Specific primers were then designed based on the pre-miRNA sequences and the target sites of the target genes, and miRNA-based SRAP marker analysis was performed. Primers were also designed surrounding the target sites in the putative target genes to indicate their chromosomal distribution. Then, a network diagram between miRNAs and their putative targets was constructed to reveal regulatory relationships. Finally, RT-PCR analysis was conducted to detect expression differences between G. hirsutum and G. barbadense in selected pre-miRNAs and putative target genes.

Results and discussion

Strategies of development markers for miRNAs and their target genes

Primers were designed for 123 pre-miRNA sequences and identical primers were eliminated with a nucleotide BLAST search. Finally, 83 pairs of pre-miRNA primers were obtained (Additional file 2: Table S2). Then, these primers were screened using single-stranded conformation polymorphism (SSCP), of which 12 primers yielding 13 polymorphic loci. The rate of primer polymorphism was 14.5%.

A total of 1,255 primer pairs were obtained for the 1,399 target genes of the miRNAs after eliminating identical primers as described above (Additional file 3: Table S3). Of these, 147 primer pairs produced 161 polymorphic loci after SSCP analysis. The rate of primer polymorphism was 11.6%.

Based on miRNA-AFLP technology, the miRNA-SRAP marker technique was performed to enrich miRNA markers. Among the 2,944 miRNA-SRAP primer combinations, ~8 scorable bands per primer combination were typically produced. Exceptions were miR319-SRAP primer combinations which generated only 0–2 scorable bands. These results are consistent with those obtained from miRNA-based AFLP marker techniques [32]. Fifty-five primers produced 59 polymorphic loci (Additional file 4: Table S4), and the rate of primer polymorphism was 1.87%. No polymorphic amplification products were obtained for miR156, miR319, miR390, miR398, and miR399 when combined with SRAP primers.

Pang et al. [32] explored the possibility of using miRNA genes as markers and identified DNA sequences of potential pre-miRNA. miRNA-AFLP analysis provides a reliable targeted genotyping strategy to assess genetic diversity among cotton species. The SRAP primers can target exons and introns respectively and generate polymorphism based them [33]; they also can combine with primers desighed from candidate gene regions, which is also called TRAP (Target Region Amplification Polymorphism) [34]. Therefore, when primers designed from conservative miRNAs and their complementary sequences combining with SRAP primers, miRNAs or their flanking sequences can be amplified. This is a highly effective strategy to examine miRNA gene distribution as well as more feasible, simple, and efficient than the miRNA-AFLP technique with respect to typing miRNA markers. Also, miRNA-SRAP markers may be functional genes; therefore, the miRNA-SRAP technique has important significance to the study of miRNAs.

Verification of miRNA-SRAP marker techniques

To verify whether the miRNA-SRAP marker technique reliably amplified miRNA sequences, PCR products were cloned and sequenced from a sample of 7 random polymorphic miRNA-SRAP primer combinations. Then, miRNA primers and SRAP primers were identified as depicted in Figure 1. The figure depicts reliably amplified products using the primer combinations instead of amplified products from individual miRNA primers or SRAP primers.

Figure 1
figure 1

Verification of miRNA-based SRAP marker technology by sequencing. SRAP primers are in bold. miRNA primers are in Italic, bolded, and in shadow.

Even distribution of miRNA markers and preferential distribution of target gene markers in the cotton genome

After linkage analysis, 9 of the 13 polymorphic loci of pre-miRNAs were mapped on 7 cotton chromosomes (Chr01, Chr04, Chr13, Chr15, Chr16, Chr23 and Chr26) based on SSCP analysis which can discover minor sequence mutations and reveal more polymorphisms to map genes [35]. For miRNA-based SRAP markers, 54 of the 59 polymorphic loci were mapped on 23 cotton chromosomes, except for Chr01, Chr22, and Chr24. These data suggest that miRNA markers are widely distributed on 24 chromosomes, except for Chr22 and Chr24, for which no miRNA markers were identified. Apparently, relatively more loci occurred on Chr10 (5 loci) and Chr19 (6 loci). Thirty-two loci were mapped on 13 chromosomes of the At sub-genome, and 31 loci were mapped on 11 chromosomes of the Dt sub-genome (Additional file 5: Figure S1, Additional file 6: Table S5). In conclusion, these 63 miRNA loci were evenly distributed on 24 chromosomes, except Chr22 and Chr24.

Of the 161 polymorphic loci for the putative miRNA target genes, 156 were mapped on the cotton genome: 59 loci were mapped on 13 chromosomes of the At sub-genome, and 97 loci were mapped on 13 chromosomes of the Dt sub-genome (Additional file 5: Figure S1, Additional file 6: Table S5). Overall, unlike miRNA markers, target gene markers were preferentially distributed on the Dt sub-genome. Specifically, chromosomes Chr13, Chr18, Chr19 and Chr21 contained a total of 47 target gene marker loci, i.e., 30% of the total target gene marker loci. Comparatively fewer less loci were on Chr01, Chr03, Chr06, Chr07, Chr08 and Chr09. Surprisingly, 95 of the 156 target gene polymorphic loci belonged to target genes from the miR414 family, whereas 61 loci belonged to 26 other miRNAs families.

To date, no miRNA genes and their target genes have been genetically mapped in cotton. Pang et al. [32] developed conserved miRNA gene markers and explored the possibility of an miRNA-AFLP marker technique. Then, the amplified products were sequenced and compared for homology using cotton expressed sequence tags. However, genetic mapping of miRNA markers has not been reported. Thus, miRNAs and their target gene markers reported here were genetically mapped in our interspecific G. hirsutum × G. barbadense population. Such genetic mapping of a segregating population provided an overview of the chromosomal distribution of miRNAs and their target genes in cotton.

Network diagram revealed by relationships between miRNAs and their target genes

In order to excavate the valuable miRNAs and putative targets for further research, miRNAs and putative target genes, mapped on the interspecific BC1 genetic linkage map, were used to create a network diagram based on their regulatory relationships (Figure 2, Table 1) by Circos (http://circos.ca/). Table 1 depicts 39 miRNAs and 24 target genes that are contained in this network diagram. Some miRNAs were associated with more than one target gene, and some target genes were regulated by more than one miRNA family. Individual miRNAs and their target genes are clearly and intuitively depicted in the network diagram (Figure 2, Additional file 7: Figure S2). As shown in Additional file 7: Figure S2, miR393 and its target TC38219 (Additional file 7: Figure S2I), and miR172 and its target TC30742 were located on the same chromosome (Additional file 7: Figure S2H). Also, miR393 and its target TC38219 were proximal on the same chromosome, separated only by 1.78 cM; whereas, miR172-Me10 and its target TC30742 were separated by 30.72 cM. miR172-Me49b and its target TC30742 were separated by 11.41 cM (Table 1). Additional file 7H shows that miR172 and its target TC30742 (both located on Chr10), and another target TC41731 (located on Chr20), are found on homologous chromosomes (Chr10 and Chr20). The miR396 (Additional file 7: Figure S2K) and miR396a families (Additional file 7: Figure S2L) have the same targets as depicted in the network diagram: TC35055 and TC39199. Chromosomes on which miR396 and miR396a are located are also on homologous chromosomes (Chr07 and Chr16), and their target, TC35055, has two sites. Furthermore, TC350055a, located on Chr19, and TC350055b, located on Chr05, are found on homologous chromosomes. Additional file 7: Figure S2G depicts the same information: miR196b-Me45 was located on Chr05 and miR196b-Em10 was located on Chr19. For miR156c-DX53560 (Additional file 7: Figure S2A), its target, TC37339 was located on Chr21 and another target TC29212 was located on Chr11, which are homologous chromosomes. Additional file 7: Figure S2F depicts the location of miR169a (Chr18) and TC29763 (Chr13), which are also located on homologous chromosomes. Other miRNAs and their target genes were located on different chromosomes.

Figure 2
figure 2

Network diagram revealed by relationships between miRNA families and their target genes. Red circle: MiRNAs; black circle: Target genes. The scale marked on each chromosome represents genetic map distance (cM). miRNA primers and target primers are marked on the chromosome, and the miRNA family and target gene relationships are indicted by each color.

Table 1 miRNAs and their target genes localization on the interspecific BC 1 linkage map

In our network diagram, the relationship and distribution of miRNAs and their putative target genes on the interspecific BC1 genetic linkage map are clearly and intuitively revealed. This diagram will facilitate our understanding of the regulatory mechanisms and networks for target genes and cell development in cotton. Allen et al. [36] stated that plant miRNAs are thought to be derived from their target sequences after gene duplication, inverted duplication, and divergence in miRNA evolution. Additional file 7: Figure S2 shows that many miRNAs and their targets, miRNAs with the same target, and targets of the same miRNA family were located on homologous chromosomes or the same chromosome. These data are consistent with Allen’s published findings [36]. Other miRNAs and their targets did not reside on the same or homologous chromosomes, which may be due to evolutionary mutations or deletions or insufficient marker numbers. It was noteworthy that homologous chromosomes, Chr5 and Chr19, contained more targets and mirRNAs as depicted in the network diagram, and this may have implications for future studies of miRNAs and their targets.

Expression differences between parents by RT-PCR and qRT-PCR

We used RT-PCR analysis to investigate pre-miRNAs and their putative target genes during fiber development between G. hirsutum and G. barbadense. A total of 181 primer pairs were used for RT-PCR analysis, including 83 pre-miRNA primers and 98 randomly selected target gene primers. Twenty-five pre-miRNA genes (30.1%) were expressed in cotton fibers. Among them, 88% of these expressed pre-miRNAs were significantly different among various stages of fiber development (0, 5, 10, 15, 20, and 25 DPA) between Emian22 and 3–79. Ten pre-miRNAs had weak or no expression in both Emian22 and 3–79 (defined as no expression differences in this study). Furthermore, 13 pre-miRNAs had similar expression patterns, and 9 pre-miRNAs had different expression patterns (Figure 3). Figure 3 also shows that 40% of the pre-miRNA genes had higher expression in early ovary development.

Figure 3
figure 3

RT-PCR analysis of pre-miRNAs between mapping parents. Numbers at the top represent 0DPA, 5DPA, 10DPA, 15DPA, 20DPA and 25DPA of Emian22 and 3–79, respectively. Similar expression tendencies between Emian22 and 3–79 were classified into similar expression patterns. Obvious differential expression tendencies between Emian22 and 3–79 were classified into differential expression patterns. Obvious differential expression levels between Emian22 and 3–79 were classified into obvious difference. Minor differential expression levels between Emian22 and 3–79 were classified accordingly. Gene primers are labeled on the left.

Regarding target genes, 66 (67.3%) were expressed in cotton fiber (Additional file 8: Table S6). Among them, 50 were obviously differentially expressed; 14 had minor differences; and 2 were not different (Figure 4). Overall, 75.8% of the expressed target genes were significantly different at various fiber development stages (0, 5, 10, 15, 20, and 25 DPA), when comparing Emian22 and 3–79. Three target genes were weakly expressed or not expressed at all stages in both Emian22 and 3–79 (defined as no expression differences in this study). Furthermore, 39 target genes had similar expression patterns. Of the 25 target genes with different expression patterns, 13 were up-regulated and 12 were down regulated in 3–79, compared with Emian22. To further confirm RT-PCR data, target genes belonging to different categories were randomly chosen for qRT-PCR analysis (Additional file 9: Figure S3). Consistent results were observed between both RT-PCR and qRT-PCR analyses.

Figure 4
figure 4

RT-PCR analysis of target genes between mapping parents. Numbers at the top represent 0DPA, 5DPA, 10DPA, 15DPA, 20DPA and 25DPA of Emian22 and 3–79, respectively. Similar expression tendencies between Emian22 and 3–79 were classified into similar expression patterns. Obvious differential expression tendencies between Emian22 and 3–79 were classified into differential expression patterns. Obvious differential expression levels between Emian22 and 3–79 were classified into obvious difference. Minor differential expression levels between Emian22 and 3–79 were classified accordingly. Gene primers and their corresponding miRNA families are labeled on the left, and only one miRNA family is listed. More family information was shown in Additional file 8.

In this study, a diagram was constructed to depict pre-miRNAs and their corresponding target genes (Additional file 10: Figure S4). Additional file 10: Figure S4 shows that miR156c-DX535560, miR472-DT527030, and miR482a-DR457519 were similar with respect to their targets in various fiber development stages of Emian22. However, no correlation was observed between pre-miRNA and its target expression in various fiber development stages of 3–79. MiR157-DT558905, miR395a-DT567568 and miR395a-DT567568 had contrasting expression tendencies with their targets in various stages of Emian22, but no correlation was observed between the expression of pre-miRNA and its targets in various fiber development stages of 3–79. MiR172-TC124126, miR413-AW187128, and miR414a-DR452834 and their targets had similar expression tendencies in various parent stages. MiR414d-DW514431 had contrasting expression tendencies with its targets in various stages of Emian22 and had similar expression tendencies in various stages of 3–79.

In summary, we report dynamic expression of pre-miRNAs and putative target genes at various fiber development stages in G. hirsutum and G. barbadense. Most expressed pre-miRNAs (88%) and their putative targets (75.8%) were obviously different with respect to expressions at different stages between the two parents. Unique expression patterns of pre-miRNAs and their putative targets may be connected with a particular function. We also found that pre-miRNA families and their putative targets had correlated expression tendencies in various growth stages, suggesting that miRNA genes may regulate target gene expression. However, the biological relevance of this must be investigated further to better understand regulatory mechanisms and the overall network of plant growth control.

Conclusions

In this study, miRNAs and target markers were developed. We found that although the primers were specific, pre-miRNA primer polymorphisms were low (14.5%), and target primer polymorphisms were even lower (11.7%). The 63 miRNA loci were evenly distributed on 24 chromosomes, except Chr22 and Chr24. Also, 156 target gene loci were preferentially distributed on the Dt sub-genome and some chromosomes. Genetic mapping in a segregating population provides an opportunity to examine distribution of miRNAs and their target genes. A network diagram depicting miRNAs and their targets confirmed that plant miRNAs may be derived from their target sequences after gene duplication or inverted duplication [36]. Comparative expression analyses of Gossypium hirsutum and G. barbadense revealed that miRNAs and their target genes play a role in cotton fiber development.

Methods

Marker development

Pre-miRNA sequences reported in the literature and a database (http://www.mirbase.org) were downloaded [3743] (Additional file 1: Table S1). Putative target genes were predicted and downloaded with psRNATarget (http://plantgrn.noble.org/psRNATarget/). Because one target gene may relate to more than one miRNA, repeat target sequences were deleted. Finally, a total of 123 pre-miRNA sequences and 1,399 putative target sequences were obtained. Primers were then designed based on the pre-miRNAs and the target sites of the putative target genes by Primer 3.0 (examples in Figure 5). Criteria for primer design were a primer length of 18–25 bp (20 bp optimum), GC content of 35–70% (50% optimum), annealing temperature of 50–65°C (55°C optimum), and PCR product size of 100–1,000 bp.

Figure 5
figure 5

Primer design strategies for miRNAs and target genes. (a) miR414d from DW514431 (Genbank acc No. DW514431); (b) target gene of miR414d from TC31963 (Cotton Gene Index No. TC31963, http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gimain.pl?gudb=cotton). *: Target sequence; >>>>>: Primer region.

Because the polymorphism is low in cotton, to genetically map more miRNAs, SRAP primers, including 64 forward primers and 64 reverse primers [44] were combined with 23 miRNA degenerate primers [32] in the form of miRNA-SRAP analysis. Finally, a total of 2,944 primer combinations were applied in this study.

PCR amplification and electrophoresis

PCR of pre-miRNAs and their putative target genes was performed in a solution (10 μL) containing 25 ng DNA template, 1 × Buffer, 2.0 mmol L-1 MgCl2, 0.25 mmol L-1 dNTPs, 0.16 μmol L-1 of forward primer, 0.16 μmol L-1 of reverse primer, 0.8 units of Taq DNA polymerase, and ddH2O was added to 10 μL. The PCR program was performed using the following profile: 95°C for 5 min, followed by 34 cycles consisting of 94°C for 50 sec, 56°C for 45 sec, and 72°C for 60 sec; and a final extension step of 5 min at 72°C. Then, PCR products were separated on an 8% non-denaturing polyacrylamide gel at a constant voltage (15 W) for about 4 h at room temperature. After electrophoresis, DNA fragments were detected by silver staining, coloring in a sodium hydroxide and formaldehyde solution.

PCR of an miRNA-based SRAP marker was performed in a solution (10 μL) containing 30 ng DNA template, 1 × Buffer, 2.0 mmol L-1 MgCl2, 0.2 mmol L-1 dNTPs, 0.1 μmol L-1 of forward primer, 0.1 μmol L-1 of reverse primer, 0.5 units of Taq DNA polymerase, and ddH2O was added to 10 μL. The PCR program was performed using the following profile: 94°C for 5 min, followed by 4 cycles of 94°C for 1 min, 35°C for 1 min, and 72°C for 1 min; then by 34 cycles of 94°C for 1 min, 50°C for 1 min, and 72°C for 1 min; and a final extension step of 10 min at 72°C. Then, PCR products were separated on a 6% denaturing polyacrylamide gel (29:1 acrylamide and N, N-methylene bisacrylamide) at a constant voltage (80 W) for about 2 h at room temperature. After electrophoresis, DNA fragments were detected by silver staining, coloring in a sodium hydroxide and formaldehyde solution.

Genetic mapping

Our BC1 mapping population ([Emian22 × 3–79] × Emian22) [45, 46] was used to map miRNAs and their putative target genes. All primers with polymorphisms between parents were used to enrich the interspecific BC1 genetic linkage map [46]. Polymorphic loci were integrated into the interspecific BC1 linkage map using JoinMap V3.0 [47]. The logarithm of odds (LOD) threshold was 5.0. Map distances in centi Morgans (cM) were calculated using the Kosambi mapping function [48]. The linkage map was drawn by MapChart V2.2 software [49].

Drawing method of network diagram

In this study, miRNAs and their putative target genes were mapped in cotton chromosomes. The distribution of miRNAs and their putative target markers was obtained from the genetic linkage map. MiRNA loci and their putative target loci that were mapped on the chromosomes were then “connected” using individually colored lines to indicate each miRNA family’s connection to their putative target genes. Then, a network diagram was obtained based on the distribution of miRNAs and their target markers on genetic linkage map by Circos (http://circos.ca/).

RT-PCR and qRT-PCR analysis

To determine the expression difference of pre-miRNAs and their putative target genes during fiber development between G. hirsutum and G. barbadense, RNAs were extracted from developing fibers at 0 days post anthesis (DPA), 5 DPA, 10 DPA, 15 DPA, 20 DPA and 25 DPA, and RNAs (4 μg) were reverse-transcribed to cDNA by the M-MLV-RT Reverse Transcriptase (Invitrogen). RT-PCR and qRT-PCR analyses were performed according to the methods described by Munis et al. [50] with minor modifications. Ubiquitin (GenBank acc No.: DQ116441) primer pair (forward 5′GAAGGCATTCCACCTGACCAAC3′, reverse 5′CTTGACCTTCTTCTTCTTGTGCTTG 3′) [51] was used as an internal standard showing the equal amounts of first-stranded cDNA in each sample. In addition, qRT-PCR experiments were biologically repeated three times.