Background

Mitochondrial DNA (mtDNA) of vertebrates is a circular molecule of 15–20 kb that normally contains 13 protein-coding genes, 22 tRNA genes, two rRNA genes, one origin of replication for the light strand (OL), and a single control region (CR). The CR is essential for the initiation of transcription and for replication of the heavy strand [1]. Most genes are encoded by the heavy (H-) strand; only the ND6 gene and eight tRNA genes are encoded by the light (L-) strand. Transcription of the L- or H-strand starts from the light-strand promoter (LSP) or the heavy-strand promoter (HSP), respectively [2, 3].

Currently, over 1700 complete mitochondrial genome (mitogenome) sequences from vertebrates are available, and although the gene order of most vertebrate mitogenomes is conserved, mtDNA gene rearrangements have been found in some groups [4–7]. Thus far, three models have been used to explain gene rearrangements in animal mtDNA. First, the recombination model, initially proposed for gene rearrangements in nuclear genomes, is characterized by breakage and rejoining of the participating DNA strands [8]. This model has been adopted to account for changes in mitochondrial gene order in frogs, birds, mussels, and others [5, 9, 10]. Another commonly accepted hypothesis is the tandem duplication and random loss (TDRL) model, which posits that rearrangements of mitochondrial gene order occur via tandem duplication of some genes followed by random deletion of some of the duplicates [11, 12]. This model is widely used to explain gene rearrangements in vertebrate mtDNA [4, 7, 13, 14]. Lavrov et al. [15] proposed a model of tandem duplication and non-random loss (TDNL) to explain the gene rearrangements in two millipede mitogenomes (Narceus annularus and Thyropygus sp.). According to this model, the mitogenome duplicates to form a dimeric genome (two monomeric mitogenomes linked head-to-tail). The duplication is then followed by gene loss that is determined by transcriptional polarity rather than occurring at random [15]. Since then, this model has been used to explain only a few gene rearrangements, all in invertebrate mitogenomes [16–18]. To date, no vertebrate mtDNA arrangement has been fitted to the model of Lavrov et al. [15].

Here we describe the complete mitogenomes of four flatfishes, Crossorhombus azureus (blue flounder), Grammatobothus krempfi, Pleuronichthys cornutus, and Platichthys stellatus, all of which belong to the superfamily Pleuronectoidea. C. azureus and G. krempfi are members of the family Bothidae, while the other two fishes belong to the family Pleuronectidae. The gene order of the G. krempfi, P. cornutus and P. stellatus mitogenomes is the same as that of a typical vertebrate. However, we have discovered a novel gene rearrangement in C. azureus mtDNA. From this mitogenome, a new model of gene rearrangement in the C. azureus lineage is inferred.

Methods

Sampling, DNA extraction, PCR and sequencing

Specimens of C. azureus (C. azu) were collected from Zhuhai, Guangdong province; G. krempfi (G. kre) from Xiangshan, Zhejiang province; and P. cornutus (P. cor) and P. stellatus (P. ste) from Qingdao, Shandong province. A portion of the epaxial musculature was excised from each fresh specimen and immediately stored at −70°C. Total genomic DNA was extracted using the SQ Tissue DNA Kit (OMEGA) following the manufacturer’s protocol. Based on alignments and comparisons of complete mitochondrial sequences of flatfishes, dozens of primer pairs were designed for amplification of the mtDNA genomes (Additional file 1: Table S1). Adjacent fragments overlapped by more than 30 bp to ensure correct assembly and integrity of the complete sequence.

PCR was performed in a 25 μl reaction volume containing 2.0 mM MgCl2, 0.4 mM of each dNTP, 0.5 μM of each primer, 1.0 U of Taq polymerase (Takara, China), 2.5 μl of 10× Taq buffer, and approximately 50 ng of DNA template. PCR cycling conditions consisted of an initial denaturation at 95°C for 3 min, followed by 30–35 cycles of denaturation at 94°C for 45 s, annealing at 45–55°C for 45 s, and elongation at 68–72°C for 1.5–5 min. The reaction was completed by a final extension at 72°C for 5 min. The PCR products were purified with the Takara Agarose Gel DNA Purification Kit (Takara, China) and used directly as templates for cycle sequencing reactions. Sequence-specific primers were further designed and used as walking primers for both strands of each fragment on an ABI 3730 DNA sequencer (Applied Biosystems, USA). The mtDNA sequences of C. azureus, G. krempfi, P. cornutus and P. stellatus have been submitted to GenBank under the accession numbers JQ639068, JQ639069, JQ639071 and NC_010966, respectively.
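The reagent volumes implied by the final concentrations above follow from the dilution relation C(stock) × V(stock) = C(final) × V(total). The sketch below illustrates that arithmetic only. The stock concentrations for MgCl2, dNTPs and primers are assumptions chosen for illustration and are not stated in the text; only the 10× buffer, which gives the stated 2.5 μl, comes from the protocol.

```python
# Dilution arithmetic for the 25 ul PCR described above.
# NOTE: the stock concentrations are hypothetical, except the 10x buffer.

REACTION_UL = 25.0

# Final concentrations as stated in the protocol.
FINAL = {
    "MgCl2 (mM)": 2.0,
    "each dNTP (mM)": 0.4,
    "each primer (uM)": 0.5,
    "Taq buffer (x)": 1.0,
}

# Assumed stock concentrations (hypothetical, for illustration only).
STOCK = {
    "MgCl2 (mM)": 25.0,
    "each dNTP (mM)": 10.0,
    "each primer (uM)": 10.0,
    "Taq buffer (x)": 10.0,  # from the text: 10x Taq buffer
}


def volume_ul(final_conc, stock_conc, total_ul=REACTION_UL):
    """Stock volume needed so that the reagent reaches final_conc in total_ul."""
    return final_conc * total_ul / stock_conc


volumes = {name: round(volume_ul(FINAL[name], STOCK[name]), 2) for name in FINAL}

for name, vol in volumes.items():
    print(f"{name:18s} {vol:5.2f} ul")
```

With these assumed stocks, the buffer volume reproduces the 2.5 μl given in the protocol; the other volumes change with whatever stock concentrations are actually used.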

Sequence analysis

Sequenced fragments were assembled to create complete mitochondrial genomes using CodonCode Aligner v3 and BioEdit v7 [19]. During the processing of large fragments and walking sequences, regular manual examinations were made to ensure reliable assembly of the genome sequence. Annotation and boundary determination of protein-coding and ribosomal RNA genes were performed using NCBI-BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Transfer RNA genes and their secondary structures were identified using tRNAscan-SE 1.21 [20], setting the cut-off values to 1 when necessary. The gene maps of each of the four flatfish mitogenomes were generated using CGView [21]. Mitogenomes of eight other Pleuronectoidea fishes were retrieved from GenBank (Additional file 2: Table S2), including one Scophthalmidae specimen, Scophthalmus maxima (S. max); one Paralichthyidae fish, Paralichthys olivaceus (P. oli); and the other six Pleuronectidae fishes: Kareius bicoloratus, Verasper variegatus (V. var), Verasper moseri (V. mos), Hippoglossus hippoglossus (H. hip), Hippoglossus stenolepis (H. ste), and Reinhardtius hippoglossoides (R. hip).

Results and discussion

The genomes of C. azureus, G. krempfi, P. cornutus, and P. stellatus are all circular molecules of 16,790 bp, 16,599 bp, 17,469 bp and 17,103 bp, respectively, and each contains 37 genes, as is typical for vertebrate mtDNAs (Figure 1, Additional file 3: Table S3 and Additional file 4: Figure S4).

Figure 1

Gene map of the mitochondrial genome of C. azureus.

Novel gene order in the C. azureus mitogenome

The arrangement of the 37 genes in G. krempfi, P. cornutus and P. stellatus mtDNA is identical to that of a typical vertebrate (Additional file 4: Figure S4). A striking finding of this study is that eight genes of the C. azureus mitogenome occupy a novel position that differs from that in any other vertebrate mitogenome. In the blue flounder, ND6 and seven tRNA genes (tRNA-Q, A, C, Y, S1, E and P), all encoded by the L-strand, have been translocated to a position between tRNA-T and tRNA-F. Thus, with one exception, genes with identical transcriptional polarity are clustered in the genome and separated by two non-coding regions. The exception is the L-strand-encoded tRNA-N gene, which is located in a region with genes of the opposite transcriptional polarity (Figure 1). Interestingly, the original order of the rearranged genes, Q-A-C-Y-S1-ND6-E-P, is maintained (Figure 2). Analysis of 1750 vertebrate mitogenomes available in GenBank (as of Nov. 2012) revealed that none had a cluster of more than five genes encoded by the L-strand. Thus, the arrangement of genes in the blue flounder mitogenome appears to be unique among vertebrates. One additional translocation is noted: tRNA-D (encoded by the H-strand) is translocated from its typical location between COI and COII to a position following CytB (Figure 2).
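The comparison described above can be sketched in a few lines of Python. The gene lists below are transcribed from the typical vertebrate arrangement and from the description of the C. azureus order in this section; they are not parsed from the GenBank records. Non-coding regions are omitted, and all function and variable names are illustrative.

```python
# Typical vertebrate gene order; "+" = H-strand encoded, "-" = L-strand encoded.
TYPICAL = [
    ("F", "+"), ("12S", "+"), ("V", "+"), ("16S", "+"), ("L(UUR)", "+"),
    ("ND1", "+"), ("I", "+"), ("Q", "-"), ("M", "+"), ("ND2", "+"),
    ("W", "+"), ("A", "-"), ("N", "-"), ("C", "-"), ("Y", "-"),
    ("COI", "+"), ("S1", "-"), ("D", "+"), ("COII", "+"), ("K", "+"),
    ("ATP8", "+"), ("ATP6", "+"), ("COIII", "+"), ("G", "+"), ("ND3", "+"),
    ("R", "+"), ("ND4L", "+"), ("ND4", "+"), ("H", "+"), ("S2", "+"),
    ("L(CUN)", "+"), ("ND5", "+"), ("ND6", "-"), ("E", "-"), ("CytB", "+"),
    ("T", "+"), ("P", "-"),
]
STRAND = dict(TYPICAL)
TYPICAL_ORDER = [gene for gene, _ in TYPICAL]

# C. azureus order as described in the text: the eight L-strand genes sit
# between tRNA-T and tRNA-F, and tRNA-D follows CytB/tRNA-T.
MOVED_L = ["Q", "A", "C", "Y", "S1", "ND6", "E", "P"]
C_AZUREUS = [g for g in TYPICAL_ORDER if g not in MOVED_L and g != "D"]
C_AZUREUS += ["D"] + MOVED_L


def longest_l_run(order):
    """Longest run of consecutive L-strand genes on the circular genome."""
    flags = [STRAND[g] == "-" for g in order]
    if all(flags):
        return len(flags)
    best = run = 0
    for is_l in flags + flags:  # doubled list handles wrap-around
        run = run + 1 if is_l else 0
        best = max(best, run)
    return best


def relative_order(order, genes):
    """Subsequence of `order` restricted to `genes`."""
    return [g for g in order if g in genes]


print(longest_l_run(TYPICAL_ORDER), longest_l_run(C_AZUREUS))
print(relative_order(TYPICAL_ORDER, MOVED_L) == relative_order(C_AZUREUS, MOVED_L))
```

Under these transcribed orders, the longest L-strand cluster grows from four genes in the typical arrangement to eight in C. azureus, while the relative order of the eight translocated genes is unchanged.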

Figure 2

Comparison of gene order between C. azureus and the typical fish mitogenome. Arabic numerals indicate the relative order of rearranged genes on the L-strand: Q-A-C-Y-S1-ND6-E-P.

CR variation in the C. azureus mitogenome

The CRs of G. krempfi, P. cornutus, and P. stellatus are located between tRNA-P and tRNA-F, as is typical, with lengths of 891 bp, 1,778 bp and 1,400 bp, respectively. Comparison of these CR sequences with those of seven other flatfishes reveals that the CR structure is typical for teleosts [22–25], including Termination-Associated Sequences (TAS-1, 2) and Conserved Sequence Blocks (CSB-2, 3). TAS-1 includes a typical TAS-complementary TAS block sequence (TAS-cTAS: TACAT-ATGTA) (Figure 3, Additional file 5: Figure S5). However, only a 263 bp non-coding fragment (NC-1) remains in the original CR location in the C. azureus mitogenome (Figure 1), and no TAS, CSB, or other conserved sequences were observed in it. Another non-coding region of 687 bp (NC-2) was found between the tRNA-D and tRNA-Q genes, and it includes possible TAS-1 and CSB-2 sequences (Figures 1, 3, and Additional file 5: Figure S5). Accordingly, we consider NC-2 to be a part of the CR. However, CSB-3 and the typical downstream sequences observed in other flatfishes were not found (Figure 3, Additional file 5: Figure S5). Generally, the LSP and HSP are situated between the CSB and tRNA-F [1, 3]. The lack of downstream sequences implies the loss of the LSP and HSP in this partial CR.
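The TAS-cTAS block named above pairs the core TACAT with its reverse complement ATGTA. The following minimal sketch shows how such a pair could be located in a sequence. The test sequence is a made-up toy string, not a real control region, and the `max_gap` parameter and function names are assumptions introduced only for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_tas_ctas(seq, core="TACAT", max_gap=30):
    """Return (core_start, partner_start) pairs where the core motif is
    followed, within max_gap bases, by its reverse complement."""
    partner = reverse_complement(core)
    hits = []
    start = seq.find(core)
    while start != -1:
        window_start = start + len(core)
        window_end = window_start + max_gap + len(partner)
        j = seq.find(partner, window_start, window_end)
        if j != -1:
            hits.append((start, j))
        start = seq.find(core, start + 1)
    return hits


# Hypothetical toy sequence containing one TACAT-ATGTA block.
toy = "ACGCTACATATGTATTATCACC"
print(reverse_complement("TACAT"))
print(find_tas_ctas(toy))
```

A plain string search like this only flags candidate blocks; identifying TAS and CSB elements in practice relies on alignment against related species, as in Figure 3.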

Figure 3

Aligned CR sequences of ten Pleuronectoidea fishes and the NC-2 sequence of C. azureus. The boxed sequences indicate the Termination-Associated Sequences (TASs; shaded sequences represent the TAS-cTAS box) and Conserved Sequence Blocks (CSBs). The underlined block indicates the sequences of NC-2 lost in C. azureus. Abbreviations of fish names are given in Methods.

Location and sequence variations of the OL region in the C. azureus mitogenome

The OL sequences in G. krempfi, P. cornutus, and P. stellatus were found between tRNA-N and tRNA-C in the tRNA gene cluster known as the WANCY region (the cluster of tRNA-Trp, Ala, Asn, Cys and Tyr), as is typical for vertebrates [26–29]. These OL sequences have the potential to fold into stable stem-loop structures with 13- or 14-bp stems and 13-, 14-, and 15-base loops (Figure 4). However, owing to the translocation of the tRNA-A, C, and Y genes in the C. azureus mitogenome, the WANCY region of this mitogenome contains only an 8-bp intergenic spacer between the tRNA-N and COI genes, which is too short to form the stem-loop structure of the OL. Loss of the OL sequence has also been seen in some other vertebrate mitogenomes, where it has been suggested that a sequence encoding a tRNA adopts a hairpin structure and acts as the OL [30–32].

Figure 4

Stem-loop structures of the OL in the P. cornutus, G. krempfi, and P. stellatus mitogenomes, and the putative substitute for the OL in the C. azureus mitogenome.

Gene rearrangement mechanism for the C. azureus mitogenome

Generally, in vertebrate mitogenomes, small-scale gene rearrangements are rare and genome-scale changes occur even less frequently [7], especially in teleostean fishes [28, 33–35]. It is difficult, therefore, to propose a mechanism to account for the observed changes in genome structure. Gene rearrangement events are usually explained by the recombination or TDRL models [7]. The genes of the C. azureus mitogenome are extensively rearranged, with eight of the nine L-strand genes clustered in the same polarity and in an unchanged relative order. These special features provide a foundation on which to suggest a mechanism for gene rearrangement in the C. azureus mitogenome. Although the gene rearrangement seen in C. azureus can be explained by recombination, TDRL or other models, these explanations are not as parsimonious as the model proposed below. For instance, to apply the recombination model to the C. azureus mitogenome, more than four recombination events would be required, and each event would need to translocate certain L-strand-encoded genes to a specific position within the L-strand gene cluster. Since even single gene rearrangements caused by recombination are known to be rare among teleost fishes, this model seems an unlikely fit to the data. Similarly, the tRNA mis-priming model [36] would require five or more specific tRNA mis-priming events. Lastly, if the TDRL model were applied to the C. azureus mitogenome, the “loss” events leading from the duplicated genome to the C. azureus arrangement would share a very peculiar characteristic that is hard to reconcile with random loss: only the L-strand-encoded genes, namely ND6 and tRNA-P, E, S1, Y, C, A and Q, were translocated and grouped together. Instead, the rearrangement of the C. azureus genome, which comprises two groups of genes with different transcriptional polarities, is better explained by the following model.

Because the gene order of 11 of the 12 flatfish mitogenomes discussed in this paper (Additional file 2: Table S2) is the same as the typical arrangement, including that of one member of the family Bothidae, G. krempfi, we hypothesize that the ancestral mitochondrial gene arrangement in the lineage leading to C. azureus (family Bothidae) was that of a typical vertebrate (Figure 5A). We further hypothesize that the processes leading to the observed blue flounder gene arrangement were as follows. The first step would have been a duplication of the entire mitogenome, resulting in a dimeric molecule with the two monomers linked head-to-tail (Figure 5B). The genes and CRs of the dimeric mtDNA are assumed to have retained their functions at this time, so that transcription could be initiated normally at the promoters (LSP1 and HSP2, LSP2 and HSP1) and would be terminated at tRNA-L(UUR) for the L-strand and at the part of the CR close to tRNA-T for the H-strand [37–39] (Figure 5B). Subsequently, the promoters in one of the control regions (assumed to be LSP2 and HSP2) lost their function or were severely impaired by mutation or fragment loss, so that the genes controlled by the disabled promoters (LSP2 and HSP2) became pseudogenes (grayed regions, Figure 5C). These pseudogenes could then accumulate additional mutations to become shorter non-coding sequences or even be lost from the genome (Figure 5D). Consequently, the genes transcribed from LSP1 to tRNA-L(UUR)₁ (gene block 1: P₁, E₁, ND6₁, S₁, Y₁, C₁, N₁, A₁ and Q₁) would be clustered together, and the other genes transcribed from HSP1 to part of the CR (gene block 2: F₂, 12S₂, V₂, …, ND5₂, CytB₂, T₂) would also be clustered, with the exception of the retained tRNA-N₂ gene, which clusters with genes of the opposite transcriptional polarity (Figure 5C, D).

Figure 5

Inferred intermediate steps from the ancestral gene order to that of the C. azureus mitogenome. Protein-coding genes and CRs are indicated by boxes, and the tRNA genes are indicated by columns. Genes labeled above the diagram are encoded by the H-strand, those below the diagram by the L-strand. LSP and HSP indicate the light-strand and heavy-strand promoters, respectively; CSB indicates Conserved Sequence Block. The direction of transcription is shown by arrows. The tRNA-N copies are marked by triangles. (A) Ancestral gene order. (B) The dimeric molecule with two monomers linked head-to-tail; the locations of LSP1, LSP2, HSP1, HSP2, tRNA-L(UUR)₁, tRNA-L(UUR)₂ and the 5’ end of the CR indicate the proposed positions of transcription initiation and termination in the two monomers. (C) Functional loss of LSP2 and HSP2; the broken line indicates the regions in which transcription is disabled; dark gray boxes indicate the degeneration of LSP2, HSP2 and the related genes. (D) The proposed translocation of tRNA-D is shown by an arrow. (E) Gene order of the C. azureus mitogenome.

The tRNA-N gene is located in the WANCY region adjoining the OL, and Seligmann and Krishnan [32] speculated that it is not only transcribed into tRNA-N but can also form OL-like structures that may function during replication of the L-strand. Therefore, although tRNA-N₂ would not be transcribed at the stage shown in Figure 5C, it was still preserved because it functioned as the OL or assisted OL function during L-strand replication. Subsequently, owing to the degradation of tRNA-L(UUR)₁ (the first termination site of L-strand transcription), transcription would terminate at tRNA-L(UUR)₂ instead of at tRNA-L(UUR)₁. Hence, tRNA-N₂ could again be transcribed (Figure 5D). Ultimately, tRNA-N₂ was preserved while tRNA-N₁ was lost. Lastly, the tRNA-D gene was translocated from its position between the COI and COII genes to a site between tRNA-T and the CR. This event can be explained by the tRNA mis-priming model or by a recombination event. Such translocations have been found in vertebrates and are relatively common in metazoan mitochondrial genome rearrangements [4, 10, 40]. The translocation of tRNA-D could have occurred either before or after the duplication and loss events postulated above. After these rearrangements, a hybrid monomeric mitogenome (gene block 1 and gene block 2) would have been formed, in which genes with identical transcriptional polarity are placed in two clusters separated by two non-coding regions (Figure 5E).
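The steps of the proposed model (Figure 5A to 5E) can be traced with a short simulation. This is a simplified sketch rather than the authors' procedure. Pseudogenization and deletion are collapsed into a single filtering step, the retention of the second tRNA-N copy is hard-coded, and the observed order is transcribed from the description in the text rather than read from the GenBank record. All names are illustrative.

```python
# Ancestral (typical vertebrate) order; "+" = H-strand, "-" = L-strand.
TYPICAL = [
    ("F", "+"), ("12S", "+"), ("V", "+"), ("16S", "+"), ("L(UUR)", "+"),
    ("ND1", "+"), ("I", "+"), ("Q", "-"), ("M", "+"), ("ND2", "+"),
    ("W", "+"), ("A", "-"), ("N", "-"), ("C", "-"), ("Y", "-"),
    ("COI", "+"), ("S1", "-"), ("D", "+"), ("COII", "+"), ("K", "+"),
    ("ATP8", "+"), ("ATP6", "+"), ("COIII", "+"), ("G", "+"), ("ND3", "+"),
    ("R", "+"), ("ND4L", "+"), ("ND4", "+"), ("H", "+"), ("S2", "+"),
    ("L(CUN)", "+"), ("ND5", "+"), ("ND6", "-"), ("E", "-"), ("CytB", "+"),
    ("T", "+"), ("P", "-"),
]

# C. azureus order as described in the text (NC-1, NC-2 = non-coding regions).
OBSERVED = [
    "F", "12S", "V", "16S", "L(UUR)", "ND1", "I", "M", "ND2", "W", "N",
    "COI", "COII", "K", "ATP8", "ATP6", "COIII", "G", "ND3", "R", "ND4L",
    "ND4", "H", "S2", "L(CUN)", "ND5", "CytB", "T", "D", "NC-2",
    "Q", "A", "C", "Y", "S1", "ND6", "E", "P", "NC-1",
]


def simulate_dimer_loss(typical):
    # Step B: head-to-tail dimer; each entry is (gene, strand, copy).
    dimer = ([(g, s, 1) for g, s in typical] + [("CR", ".", 1)]
             + [(g, s, 2) for g, s in typical] + [("CR", ".", 2)])
    kept = []
    for gene, strand, copy in dimer:
        if gene == "CR":
            # CR1 keeps the promoters (NC-1); CR2 is the partial CR (NC-2).
            kept.append("NC-1" if copy == 1 else "NC-2")
        elif gene == "N":
            # Exception in the model: tRNA-N of copy 2 is retained, copy 1 lost.
            if copy == 2:
                kept.append(gene)
        elif (strand == "-" and copy == 1) or (strand == "+" and copy == 2):
            # Steps C/D: L-strand genes survive in copy 1 (read from LSP1),
            # H-strand genes survive in copy 2 (read from HSP1).
            kept.append(gene)
    # Step D: translocate tRNA-D to the site between tRNA-T and the partial CR.
    kept.remove("D")
    kept.insert(kept.index("T") + 1, "D")
    return kept


def rotate_to(order, first):
    """Rotate a circular gene order so that it starts at `first`."""
    i = order.index(first)
    return order[i:] + order[:i]


simulated = rotate_to(simulate_dimer_loss(TYPICAL), "F")
print(simulated == OBSERVED)
```

Under these assumptions the simulated order coincides with the described C. azureus arrangement, which shows that one duplication plus polarity-dependent loss and a single tRNA-D translocation is sufficient to produce it.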

Details and support for the model

The inferred “dimer-mitogenome” intermediate of the C. azureus mtDNA (Figure 5B) could be formed by two entire mitogenomes or from two longer mtDNA fragments that include all L-coding genes (namely from tRNA-Q to CR, Figure 5A). While the duplication of a very large fragment is unusual in vertebrate mitogenomes, the dimeric mitogenome molecule has been observed in many animals [17, 41, 42] including almost all mammals [43]. Therefore, a duplication of the complete genome is more likely than the duplication of a very large fragment.

The inferred intermediate rearrangement for the C. azureus mitogenome is similar to that of the TDNL model [15]. The crucial step in both models is the loss of function of one set of light- and heavy-strand promoters. The two non-coding regions (NC-1, NC-2) present in the C. azureus mitogenome provide evidence for this intermediate step. When comparing the CR structure with those of other fishes, we found that the 687 bp NC-2 region includes possible TAS-1 and CSB-2 sequences, but not the region downstream of the CSB where the LSP and HSP would lie (Figure 3). This feature provides evidence that one set of transcriptional promoters in the CR lost function (Figure 5C). To date, no conserved sequences of the LSP and the HSP have been found in teleostean fishes. However, the logical position of the promoters in the C. azureus mitogenome would be in NC-1, for the following reasons. First, most studies [1, 37, 38] agree that the HSP and LSP must be located very close to tRNA-F and the 5’ end of the 12S rRNA gene. NC-1 is the closest region to those genes. Second, NC-1 is located where the two gene clusters are separated according to their transcriptional polarities, allowing transcription to originate in both directions (Figure 5D). Finally, according to previous studies, the LSP and HSP must be located in a non-coding region not far from the 3’ end of the CSB (close to the origin of replication for the H-strand, OH), because the RNA primer from the LSP to OH is necessary for mitochondrial replication [1, 44]. Again, NC-1 is the closest sufficiently long non-coding region located downstream of the CSB (Figure 1, Additional file 3: Table S3a). In summary, the features of NC-1 support the interpretation in our model that the other CR retains the promoters.

Conclusions

In summary, we determined the complete mitochondrial genomes of four flatfishes, Crossorhombus azureus (blue flounder), Grammatobothus krempfi, Pleuronichthys cornutus, and Platichthys stellatus. The genes of the C. azureus mitogenome are extensively rearranged with eight of nine genes on the L-strand in the same polarity and their relative order unchanged. A mechanism similar to the TDNL model is proposed to explain the origin of these special features. The model also explains the gene-rearrangements in which genes are clustered in the same polarity (L- or H-strand coding) with their relative order unchanged.

Data accession

Sequences were deposited in the NCBI [JQ639068, JQ639069, JQ639071, NC_010966].