Background

Flatfishes share an asymmetrical body plan and a bottom-dwelling mode of life. Their high phenotypic similarity has caused considerable confusion over the number and nomenclature of taxa, depending on the weight assigned to particular morphological features (Chapleau [1993]; Cooper and Chapleau [1998]; Hoshino [2001]). It has therefore become necessary to develop molecular markers to resolve controversial aspects of flatfish systematics.

Generally, mitochondrial DNA (mtDNA) in vertebrates consists of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, 1 origin of replication on the light strand (OL), and a single large control region (CR). Most genes are encoded by the heavy (H-) strand, while only the ND6 gene and eight tRNA genes are encoded by the light (L-) strand (Boore [1999]). Due to its simple structure, lack of recombination, multi-copy status in a cell, maternal inheritance, and high evolutionary rate, mtDNA has been extensively used in population genetic studies and phylogenetic analyses (Miya et al. [2003]; Inoue et al. [2010]; Shi et al. [2011]).

To date, several mtDNA genes have been used as molecular markers to establish phylogenetic relationships among flatfishes, such as rRNA genes (Azevedo et al. [2008]), the cytochrome b gene (Borsa and Quignard [2001]), the control region (Tinti et al. [1999]), and their combinations (Infante et al. [2004]). Nevertheless, it has been shown that the use of limited sequence data and of markers with different evolutionary rates may cause errors in inferring the evolutionary relationships among taxa. In this respect, complete mitochondrial genomes have proven able to resolve persistent controversies over higher-level relationships of teleosts (Miya et al. [2003]; Kawahara et al. [2008]; Inoue et al. [2010]; Shi et al. [2011]). Currently, the complete mtDNA sequences of more than 1,000 fish species have been determined (as of 21 April 2013, http://www.ncbi.nlm.nih.gov/), including 14 species from 6 families in Pleuronectiformes.

Peltorhamphus novaezeelandiae (common sole), Colistium nudipinnis (turbot), and Pelotretis flavilatus (lemon sole) belong to the family Rhombosoleidae of Pleuronectiformes (Nelson [2006]). These fishes form a primarily South Pacific group, occurring mostly around Australia and New Zealand. The taxonomic status of the rhombosoleids remains controversial. Regan ([1910]) separated Pleuronectidae into three subfamilies, Pleuronectinae, Samarinae, and Rhombosoleinae, and Hubbs ([1945]) accepted this view. Chabanaud ([1946]) then recommended a familial ranking for Rhombosoleinae based on three ‘highly important’ morphological characteristics. However, this recommendation was not widely accepted. Some researchers agreed with the classification (Chapleau and Keast [1988]; Cooper and Chapleau [1998]; Guibord [2003]; Nelson [2006]), while others continued to use a subfamilial ranking (Sakamoto [1984]; Li and Wang [1995]; Schwarzhans [1999]; Evseenko [2004]).

Previously, no complete mitogenome of a rhombosoleid fish had been reported, and only a few mitogenomic fragments were available. In the present study, the complete mitochondrial sequences of the three rhombosoleid fishes were determined for the first time. The genomic features of these mitogenomes were analyzed and compared with those of other flatfish mtDNAs. The results could provide useful data for studies of mitogenome structure in fishes and for the development of molecular markers to explore classification issues within Rhombosoleidae and Pleuronectiformes in the future.

Methods

Sampling, DNA extraction, PCR, and sequencing

Fish samples were obtained from the Sydney fish market, Australia, and preserved in 75% ethanol. Ethical approval was not required because the specimens used in the present study were common, commercially captured marine fishes that were already dead when obtained from commercial fisheries. None of these species is included in the IUCN list of endangered species. Total genomic DNA was extracted from muscle tissue with a DNA extraction kit (TIANGEN Biotech, Beijing, China) following the manufacturer’s protocol.

The primers used to amplify the contiguous, overlapping segments of the complete mitochondrial genomes of the three rhombosoleid fishes (Table 1) were designed by aligning and comparing previously reported mitogenomic sequences of flatfishes or were taken from other references (Palumbi et al. [1991]; Kong et al. [2009]; Shi et al. [2011]). PCR was performed in a 25-μl reaction volume containing 0.2 mM dNTP, 0.5 μM of each primer, 1.0 U Taq polymerase (Takara, Dalian, China), 2.5 μl of 10× LA PCR Buffer II (Mg2+ Plus), and approximately 50 ng DNA template. The PCR cycling consisted of an initial denaturation at 95°C for 3 min; 35 cycles of denaturation at 95°C for 30 s, annealing at 48°C for 40 s, and elongation at 68°C to 72°C for 1 to 4 min; and a final extension at 72°C for 10 min. The PCR products were checked on 1.0% agarose gels, purified with the Takara Agarose Gel DNA Purification Kit (Takara Bio Inc., Beijing, China), and used directly as templates for cycle sequencing reactions in both directions (ABI 3730 DNA sequencer, Life Technologies Biotechnology Corporation, Shanghai, China). Fragments that could not be sequenced directly were inserted into the pMD20-T vector (Takara), transformed into E. coli DH5α for cloning, and then sequenced. New primers were designed for primer-walking sequencing. An overlap of more than 30 bp between adjacent sequences was required to ensure the correct assembly and integrity of the complete sequences.
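The overlap criterion described above can be illustrated with a short Python sketch. This is not the assembly procedure used in the study (which relied on CodonCode Aligner and BioEdit); it is a minimal illustration that assumes error-free reads with an exact overlap, and the function names are hypothetical.

```python
def suffix_prefix_overlap(left: str, right: str) -> int:
    """Return the length of the longest suffix of `left` that is a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_fragments(left: str, right: str, min_overlap: int = 30) -> str:
    """Join two adjacent reads, requiring an exact overlap longer than `min_overlap` bp.

    Assumption: reads are error-free, so the overlap is an exact string match.
    """
    k = suffix_prefix_overlap(left, right)
    if k <= min_overlap:
        raise ValueError(f"overlap of {k} bp is not longer than {min_overlap} bp")
    # Keep the overlapping region only once.
    return left + right[k:]
```

Real sequencing reads contain base-calling errors, so a production assembler would score approximate overlaps instead of requiring exact matches.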

Table 1 PCR primers for amplification of the complete mitogenomes of the three rhombosoleid species

The sequenced fragments were assembled into complete mitochondrial genomes using CodonCode Aligner (vers. 3, CodonCode Corporation, Dedham, MA, USA) and BioEdit (Hall [1999]). Annotation and boundary determination of protein-coding genes and rRNA genes were performed using NCBI-BLAST (http://blast.ncbi.nlm.nih.gov). Alignments with previously published mitogenomic sequences of flatfishes and other closely related bony fishes were carried out to ensure accuracy (Table 2). The transfer RNA genes and their secondary cloverleaf structures were identified using tRNAscan-SE 1.21 (Lowe and Eddy [1997]), with cutoff values set to 1 when necessary. The secondary structures of tRNA Ser(AGC) and tRNA Cys were further constructed with RNAstructure (Reuter and Mathews [2010]) and confirmed by examination of their anticodons.

Table 2 Information on the mitogenome sequences of the 17 flatfishes used in the present study

To compare the mitogenome characteristics among Pleuronectiformes, the complete mitogenome sequences of 14 other flatfishes from 6 families were retrieved from GenBank (Table 2).

Results and discussion

Genome organization

The lengths of the P. novaezeelandiae, C. nudipinnis, and P. flavilatus mitogenomes are 16,889, 16,588, and 16,937 bp, respectively (GenBank accession no. JQ639065, JQ639063, KC554065; data for the three species are given in this order throughout). Their gene arrangements are identical to those of typical teleost species (Saitoh et al. [2000]; Yue et al. [2006]). All three mitogenomes contain 13 protein-coding genes, two rRNA genes, 22 tRNA genes, one OL, and one CR. All genes are encoded by the H-strand except for ND6 and eight tRNA genes, which are encoded by the L-strand (Table 3). Base compositions of the three mitogenomes are 26.6% to 28.2%, 27.1% to 28.1%, 16.7% to 17.6%, and 27.7% to 28.1% for A, C, G, and T, respectively, with a bias toward AT (54.3% to 56.1%). Intergenic spacers total 34, 28, and 66 bp in the three genomes, respectively. Most spacers are 1 to 7 bp long, except for two larger spacers in P. flavilatus: one of 28 bp between tRNA Leu(UUA) and ND1, and one of 18 bp between tRNA Asp and COII, which is a poly-C tract. Gene overlaps total 37, 37, and 39 bp, respectively. The four notable overlapping positions (ATP8 and ATP6, ATP6 and COIII, ND4L and ND4, ND5 and ND6) were also observed, as reported in other vertebrate species (Kong et al. [2009]) (Table 3).
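Base composition and AT content of the kind reported above can be computed from a sequence string as in the following Python sketch. It is only an illustration; the function names are hypothetical, and ambiguous bases such as N are assumed to be excluded from the totals.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G, and T among the unambiguous bases of `seq`."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(seq: str) -> float:
    """AT content (%) computed from the base composition."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]
```

For example, `at_content("AACGTT")` gives about 66.7, since four of the six bases are A or T.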

Table 3 Features of the mitogenomes of three rhombosoleid fishes

Protein-coding genes

The total length of the 13 protein-coding genes is 11,441 bp in P. novaezeelandiae and C. nudipinnis but 11,424 bp in P. flavilatus. Comparison of the length of each gene reveals that the genes are highly conserved in size except for the ND1 gene (28 bp shorter) and the ND5 gene (13 bp longer) in P. flavilatus mtDNA. The start codons are identical in the three species. Eleven genes start with ATG, whereas COI starts with GTG and ND3 with ATA, which has rarely been found in fish mitogenomes to date (other examples include Albula glossodonta, Monopterus albus, Petroscirtes breviceps, Solea senegalensis, and Cynoglossus semilaevis) (Miya et al. [2003]; Miya et al. [2001]; Inoue et al. [2004]; Kong et al. [2009]). Ten of the 13 genes use the same stop codon in all three species. The ND5 gene ends with TAA in P. novaezeelandiae and P. flavilatus and with TAG in C. nudipinnis; COII and cytb end with TAA in P. novaezeelandiae but with an incomplete stop codon T in C. nudipinnis and P. flavilatus (Table 3).
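The start and stop codons of an annotated gene, including incomplete stop codons such as a single T, can be read off as in the Python sketch below. It assumes that the input is the annotated coding sequence in its coding direction and that a length not divisible by three indicates an incomplete stop codon. The stop codon set is that of the vertebrate mitochondrial genetic code, and the function name is hypothetical.

```python
# Stop codons of the vertebrate mitochondrial genetic code.
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def start_and_stop(cds: str) -> tuple:
    """Return (start codon, stop codon or trailing fragment, is_complete_stop).

    Assumption: `cds` is the annotated gene read 5' to 3' on its coding strand.
    """
    cds = cds.upper()
    start = cds[:3]
    remainder = len(cds) % 3
    # A leftover of 1 or 2 bases is taken to be an incomplete stop (T or TA).
    stop = cds[-remainder:] if remainder else cds[-3:]
    return start, stop, stop in STOP_CODONS
```

For instance, `start_and_stop("GTGAAAT")` returns `("GTG", "T", False)`, which corresponds to a GTG start with an incomplete T stop.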

The base composition of the 13 protein-coding genes follows the order T > C > A > G. The proportions of the four bases show no apparent bias at the first codon position but differ markedly at the second and third positions. The percentage of T at the second position reaches 40.9% to 41.1%, whereas that of G is only 13.6% to 13.8%. In particular, G at the third position is only 9.6% to 12.5%, which is in agreement with previous reports (Saitoh et al. [2000]; Miya et al. [2003]; Oh et al. [2007]). Codon usage differs slightly among the three rhombosoleids. The most frequently used amino acid is leucine (16.5% to 17.2%), while cysteine (0.7% to 0.8%) is the least frequently used. Sequence identity of the genes among the three rhombosoleid species ranges from 63% (ND2 gene) to 85% (COII gene). The similarity between C. nudipinnis and P. flavilatus is generally higher than that between P. novaezeelandiae and either of these two fishes (Figure 1).
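Base composition at each codon position, as discussed above, can be tabulated with a sketch like the following. It assumes an in-frame coding sequence and ignores any trailing incomplete codon; the function name is hypothetical.

```python
from collections import Counter


def codon_position_composition(cds: str) -> list:
    """Return three dicts with the % of A, C, G, T at codon positions 1, 2, and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing incomplete codon
    result = []
    for pos in range(3):
        counts = Counter(cds[pos:usable:3])
        total = sum(counts[b] for b in "ACGT")
        result.append(
            {b: (100.0 * counts[b] / total if total else 0.0) for b in "ACGT"}
        )
    return result
```

The list index is zero-based, so `codon_position_composition(cds)[2]["G"]` is the percentage of G at the third codon position.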

Figure 1

Sequence identities of 13 protein genes and two rRNA genes among the three rhombosoleids.

Ribosomal and transfer RNA genes

The two rRNA genes are located, as is typical, between tRNA Phe and tRNA Leu (UUA) and are separated by tRNA Val (Table 3). The lengths of the 12S rRNA genes are similar, whereas those of the 16S rRNA genes differ by approximately 15 bp among the three species (Table 3). Sequence identity among the three rhombosoleids is very similar for 12S rRNA but differs slightly for 16S rRNA (Figure 1).

The 22 tRNA genes are interspersed between rRNA and protein-coding genes. Most of these tRNAs are similar in length to those of other fishes, except for tRNA Glu, which is 5 bp longer in P. flavilatus than in the other fishes. The majority of tRNA genes could be recognized and folded into secondary structures by tRNAscan-SE, except for two genes that were identified by comparison with other flatfishes. One is the tRNA Ser(AGC) gene in C. nudipinnis and P. flavilatus, as is common in fishes, and the other is the tRNA Cys gene, in which the dihydrouracil loop cannot be formed in any of the three fishes’ mtDNA. The stem lengths are 7, 5, 4, and 5 bp for the amino acid, anticodon, DHU, and TΨC arms, respectively. Both the anticodon and TΨC loops are 7 nucleotides long, whereas the DHU loop size varies from 5 to 11 nucleotides.

Non-coding sequences

The OL is located, as usual, between tRNA Asn and tRNA Cys in the WANCY region and is 38 to 44 bp in size. These regions have the potential to fold into a stem-loop structure with 13 or 14 bp in the arms and 10 or 14 nucleotides in the loops. In the three rhombosoleids, the highly conserved sequence motif 5′-GCCGG-3′ is substituted by 5′-GCCAG-3′ (Figure 2).

Figure 2

The stem-loop structures of OL in the mitogenomes of the three rhombosoleids. The underlined sequences indicate the conserved sequence motif. (a) P. novaezeelandiae, (b) C. nudipinnis, and (c) P. flavilatus.
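Whether a candidate OL sequence can form a stem-loop of the kind shown in Figure 2 can be checked roughly with the Python sketch below. It counts only perfect Watson-Crick pairs formed between the two ends of the sequence and ignores G-U wobble pairs and thermodynamic stability, so it is a simplification and not the folding method used in the study. The function names and the `min_loop` parameter are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def terminal_stem(seq: str, min_loop: int = 3) -> tuple:
    """Return (stem length in bp, loop length in nt) for a hairpin whose arms
    are the two ends of `seq`, pairing inward from the termini."""
    seq = seq.upper()
    n = len(seq)
    stem = 0
    # Extend the stem while the next pair is complementary and the loop
    # would still contain at least `min_loop` nucleotides.
    while (n - 2 * (stem + 1) >= min_loop
           and COMPLEMENT.get(seq[stem]) == seq[n - 1 - stem]):
        stem += 1
    return stem, n - 2 * stem
```

A sequence built from a 13-nt arm, a 10-nt poly-T loop, and the reverse complement of the arm yields `(13, 10)` with this function.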

The control regions are located between tRNA Pro and tRNA Phe. Their lengths differ considerably: 919, 1,212, and 1,218 bp in C. nudipinnis, P. novaezeelandiae, and P. flavilatus, respectively (Table 3). These differences mainly result from the presence of tandem repeats at the 3′-end, in which both the motif length and the copy number vary. There are two 7-bp motifs with 6 or 39 copies in P. novaezeelandiae, a 17-bp motif with seven copies in C. nudipinnis, and a 145-bp motif with three copies in P. flavilatus (Figure 3).

Figure 3

Alignment of the control regions of P. novaezeelandiae (P.nov), C. nudipinnis (C.nud), and P. flavilatus (P.fla) mtDNA. The blocks CSB-A, B, C, D, E, F, poly-T, CSB-2, and CSB-3 are shaded. CSB, conserved sequence block. The sequence in parentheses indicates the motif of the tandem repeat, and the Arabic numeral indicates the copy number.
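The copy number of a known tandem-repeat motif in a control region can be counted with a sketch such as the following. It assumes perfect, uninterrupted copies of a motif that is already known; discovering unknown or imperfect repeats would require a dedicated tool. The function name is hypothetical.

```python
import re


def max_tandem_copies(seq: str, motif: str) -> int:
    """Largest number of back-to-back exact copies of `motif` found in `seq`."""
    if not motif:
        raise ValueError("motif must not be empty")
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    # Each match is one uninterrupted run of the motif.
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)
```

Applied to a sequence containing six consecutive copies of a 7-bp motif, the function returns 6.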

The AT contents of the CRs range from 64.4% to 66.7%, higher than those of the whole mtDNA sequences. The characteristic structures of the CR are present, as in other fishes (Figure 3), including the extended termination associated sequence (ETAS, containing TAS-cTAS: TACAT-ATGTA), central conserved sequence blocks (CSB-F, E, D), G-box (GTGGGGG), pyrimidine tract (poly-T), and conserved sequence blocks (CSB-2 and CSB-3) (Nesbo et al. [1998]; Manchado et al. [2007]; Wang et al. [2013]).

Comparative analyses with other flatfishes

To date, 14 mitogenome sequences from other flatfishes have been determined (Table 2). To better understand the features of flatfish mtDNAs, a comparative analysis was carried out on several aspects.

First, the gene content of the 17 mitogenome sequences (including the three from the present study) is the same, consisting of 37 genes, 1 OL, and 1 CR. However, the gene arrangements differ among them. Apart from the two tongue soles (C. semilaevis and C. abbreviatus), the gene orders of the other 15 are identical to that of typical fish mitogenomes. In the tongue sole mitogenomes, the tRNA Gln gene is inverted from the light strand to the heavy strand (inversion), accompanied by shuffling of the tRNA Ile gene and long-range translocation of the control region downstream to a site between the ND1 and the tRNA Gln genes.

The lengths of the 17 mitogenome sequences differ considerably (from 15,973 bp in Kareius bicoloratus to 18,017 bp in Reinhardtius hippoglossoides). The short sequence of K. bicoloratus mtDNA results from the unfinished sequencing of its CR, and the lengths of the two tongue soles (16,417 bp and 16,731 bp) result from the rearrangement of the CR. The variation among the other fishes is primarily caused by the presence of repeated arrays in the control regions.

The proportions of three bases (A, T, and C) show no obvious differences, ranging from 20% to 30%; however, that of G is remarkably lower, from 14.5% (C. semilaevis) to 17.7% (Platichthys stellatus). The AT contents of 15 of the mitogenomes range from 53.21% to 56.08%, but those of the two tongue soles reach 60.35% and 60.59% (Table 2).

Gene region

The lengths of the gene sequences are relatively conserved, except for the ND2 gene (1,110 bp) of Scophthalmus maximus (approximately 50 bp longer) and the ND1 gene (948 bp) of P. flavilatus (approximately 25 bp shorter). For the other genes, no significant difference was observed.

In this study, the AT and GC skews of the 13 protein-coding genes in the 17 flatfish mitogenomes were analyzed. Compositional skew was estimated using the following formulas: GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T), where C, G, A, and T are the frequencies of the four bases at the third codon position of the eight fourfold degenerate codon families (Perna and Kocher [1995]). With the exception of the AT skew in C. nudipinnis, the mtDNAs show a typical negative GC skew and positive AT skew. The absolute values of the GC skews are always higher than those of the AT skews, with the former ranging from 0.45 to 0.71 and the latter from 0.02 to 0.39 (Table 2). These results indicate that the usage of G/C is more unbalanced than that of A/T in the studied flatfishes.
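The skew formulas above can be evaluated as in the following Python sketch. It assumes that the eight fourfold degenerate codon families are those of the vertebrate mitochondrial genetic code (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, and Gly GGN) and that each coding sequence is supplied in frame in its coding direction. The function names are hypothetical, and this is not the authors' own script.

```python
# First two bases of the eight fourfold degenerate codon families
# (assumed: vertebrate mitochondrial genetic code).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_third_bases(cds: str) -> str:
    """Collect third-position bases of fourfold degenerate codons in `cds`."""
    cds = cds.upper()
    codons = (cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    return "".join(
        c[2] for c in codons if c[:2] in FOURFOLD_PREFIXES and c[2] in "ACGT"
    )


def skew(x: int, y: int) -> float:
    """(x - y) / (x + y), or NaN when both counts are zero."""
    return (x - y) / (x + y) if (x + y) else float("nan")


def gc_at_skew(bases: str) -> tuple:
    """Return (GC skew, AT skew) for a string of bases."""
    a, c, g, t = (bases.count(b) for b in "ACGT")
    return skew(g, c), skew(a, t)
```

In practice, the third-position bases from all 13 protein-coding genes of a mitogenome would be concatenated before calling `gc_at_skew`; how genes on the L-strand (ND6) are oriented for this purpose is a choice that the sketch leaves to the user.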

Non-coding sequences

The OL of all 17 flatfish mitogenomes is situated at the typical site of bony fishes. The lengths range from 41 bp (C. abbreviatus) to 50 bp (Zebrias zebrinus) (Table 2). All these regions have the potential to fold into a stem-loop structure with 11 to 15 bp in the arms and 10 to 15 nucleotides in the loops. The highly conserved sequence motif 5′-GCCGG-3′ is substituted in some species, such as by 5′-CCCGG-3′ in Z. zebrinus, 5′-GCCAG-3′ in rhombosoleids, and 5′-ACCGG-3′ in Paralichthys olivaceus and S. maximus (Table 2).

The control region of the 14 mitogenomes (excluding the unfinished one of K. bicoloratus) is typically situated between tRNA Pro and tRNA Phe, except in the two tongue soles (where it lies between ND1 and tRNA Gln). The lengths differ considerably, from 661 bp (C. abbreviatus) to 2,312 bp (R. hippoglossoides). The differences in length primarily result from the presence of tandem repeats within the CR. The repeat regions in Psettodes erumei occur at both the 5′- and 3′-ends of the CR, while those in the others are located at either the 5′-end (in the tongue soles) or the 3′-end (in the remaining fishes). The motif lengths and copy numbers of the tandem repeats vary both within and between species (Table 2). A special case is Z. zebrinus, which lacks a repeat region.

Based on the alignment of the CR sequences of 14 flatfishes, the typical tripartite structure was found, as in P. novaezeelandiae, C. nudipinnis, and P. flavilatus mtDNA. The six blocks CSB-F, E, D, C, B, and A were identified in the central conserved blocks domain, and the key sequences of each block are as follows: CSB-F: GTAAGAGCCTACCAACCGG; CSB-E: GGGTGAGGGACAAAAATT-GTGGGGG; CSB-D: TATTCCTGGCATTTGGTTCC-TACTTCAGGGCCAT; CSB-C: CTTACATAAGTTAATG; CSB-B: CATACGACTCGTTACCCAGCAAGCCGGGCGTTC; CSB-A: CTCCAGCGGGTAAGGGG. The G-box (GTGGGGG) is the most conserved part of CSB-E. In addition, a pyrimidine tract following CSB-A was identified (TTCTC-TTTTTTTTTTTCCTTTC). Two conserved sequence blocks, CSB-2 and CSB-3, were identifiable at the 3′-end of the CR; their sequences are CSB-2: AAAACCCCCC-TACCCCCCTAAA and CSB-3: CCTGAAAACCCCCCGG, respectively (see Additional file 1).

Generally, the control region is more variable than the other sequences in the mitogenome. However, it also contains some conserved structures, as these are thought to include functional elements such as the origin of H-strand replication (OH) and the heavy strand promoter (HSP) and light strand promoter (LSP) of transcription (Shadel and Clayton [1997]). So far, the sequences of some conserved blocks have not been defined. Generally, only the TAS and CSB-F, E, and D of the central conserved blocks are identified in most fishes, while CSB-C, B, and A are highly variable (Lee et al. [1995]; Guo et al. [2003]; Manchado et al. [2007]; Zhang et al. [2010]), and CSB-1, 2, and 3 have been found only in some fishes (Lee et al. [2001]; Liu [2002]; Guo et al. [2003]).

Conclusions

We sequenced the complete mitogenomes of three rhombosoleid fishes in the Pleuronectiformes. Comparative genomic analysis within the rhombosoleids revealed that the genomic features of P. flavilatus differ noticeably from those of the other two species.

The comparison of the complete mitogenome sequences of the 3 rhombosoleids with those of the other 14 flatfishes reveals several differences among them. Firstly, the genomic arrangement of 15 of the mitogenomes is identical to that of a typical teleost, but the gene order of the two tongue soles shows clear rearrangements. Secondly, the length heterogeneity is large, up to 1,600 bp, mainly because of the presence of repeat regions in the CRs. The 15 species have tandem repeats, which are distributed at all potential sites in the CR, including the 3′-end, the 5′-end, or both. Moreover, the motif length and copy number vary both within and between species. Thirdly, six blocks (CSB-F, E, D, C, B, and A) in the central conserved blocks domain and CSB-2 and CSB-3 in the conserved sequence blocks were identified. However, CSB-1 is not conserved in the flatfishes studied. These findings illustrate the diversity of repeats in flatfishes and provide useful data for further studies on the structure of mitogenomes in fishes.

In summary, the complete mitogenomic sequences of three rhombosoleids and rich molecular information were obtained in this study. These data will contribute to resolving existing controversies, such as the taxonomic status and phylogenetic relationships of rhombosoleids. The comparative genomic analysis of flatfishes conducted here may help to better understand the evolution of mitogenomic structures and to explore the phylogenetic relationships of the Pleuronectiformes.

Data accession

Sequences were deposited in NCBI GenBank (accession nos. JQ639065, JQ639063, and KC554065).

Additional file