Background

Mitochondria are essential organelles within cells and are responsible for energy transduction, metabolism, cell growth and survival [13]. Inside mitochondria, there is a genome called mitochondrial (mt) genome. Mt genomes are present in almost all eukaryotic cells and have remarkable variations in size, structure, and organization [46]. The largest mt genome has been found in muskmelons with an estimated size of 2400 kb [79]. The smallest mt genome of only 6 kb in length has been reported in an apicomplexan parasite (Plasmodium) [10, 11].

The structure of mt genome contains two major types, the linear form and the circular form. The circular forms are usually present in animal mt genomes with the size ranging from 15 kb to 20 kb, containing 12-13 protein-coding genes, 22 transfer RNA (tRNA) genes and two ribosomal RNA (rRNA) genes, and gene arrangements in the genomes are extremely stable [12]. The linear forms have been documented in many apicomplexan parasites, including Plasmodium, Babesia, Theileria and Eimeria[5, 10, 11, 13]. Compared with animal mt genomes, the mt genomes of apicomplexan parasites encode only three protein-coding genes (cytochrome c oxidase subunits I [cox1] and III [cox3], and cytochrome b [cob]) and six fragments of large subunit rRNA genes [14]. The gene arrangements are also different among animal mt genomes and apicomplexan parasites mt genome.

Meanwhile, in apicomplexan parasites, mitochondrial protein-coding genes have been extensively used as genetic markers for phylogenetic analysis at different taxonomic levels, serving as an ideal model for gene rearrangement, and evolutionary studies [13, 15]. Most of the phylogenies of apicomplexan parasites were constructed using cox1 or cob alone, however, in some cases, a combination of cox1 and cob was employed to evaluate the phylogenetic relationships [13, 16]. In addition, mt genome sequences are also very valuable for population genetic studies [1719] as reported in Plasmodium vivax, Plasmodium knowlesi, Trypanosoma cruzi[2022].

B. orientalis is a tick-borne, intra-erythrocytic protozoan parasite causing buffalo babesiosis characterized by fever, anemia, icterus, haemoglobinuria and high mortality [23, 24]. This species was first reported in 1987 and then identified as a new species named Babesia orientalis in 1997 [25]. The new species was discovered initially based on the differences in morphology, transmission, pathogenicity and endemic areas, compared to Babesia bigemina and Babesia bovis[25], and later confirmed by the phylogenetic analysis based on 18S rRNA and heat shock protein 70 (HSP70) genes [26, 27]. The disease caused by B. orientalis is one of the most important parasitic diseases of buffalo in central and south China, resulting in enormous economic losses [27, 28]. In spite of its importance, very limited information was available about this parasite, especially at the molecular level, including mt genome sequences and structures.

In the present study, B. orientalis (Wuhan strain) mt genome was determined and annotated. The structure was characterized and compared with those of related species. In addition, the evolution of structural divergence in the apicomplexan mt genomes was discussed.

Methods

Parasite cultivation

Babesia orientalis (Wuhan strain) was cultivated according to the protocol of He et al. [29]. In brief, two, 1-year-old water buffalo, free of B. orientalis infection as confirmed by microscopy and real-time PCR [29], were splenectomized 14 days prior to B. orientalis infection. Each buffalo was subcutaneously injected with 4 ml of B. orientalis-infected blood (Wuhan strain, percentage parasitized erythrocytes, PPE 1%). Blood samples were collected everyday to monitor the parasitemia until PPE reached 3%.

All the experimental animals were housed, fed and given clean drinking water in accordance with the stipulated rules for the regulation of the administration of affairs concerning experimental animals of P.R. China.

B. orientalis mitochondria DNA sequencing

The blood from experimentally infected buffalo was collected in EDTA. Parasite genomic DNA was extracted from B. orientalis-infected blood using QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. The terminal inverted repeat (TIR) regions in the beginning and end of B. orientalis mt genome, 1-181 bp and 5835-5996 bp, were obtained by using B. bovis mt genome sequence to blast the full genome sequence of B. orientalis (unpublished data). Four pairs of primers were designed based on these two mtDNA sequences (1-181 bp and 5835-5996 bp) of B. orientalis and by aligning reported mt genomes of B. bovis (EU075182 and AB499088), B. bigemina (AB499085), and T. parva (Z23263 and AB499089) (Table 1 and Figure 1). The entire mt genome was amplified by P1 and R4. In order to confirm the mt genome, four sets of PCR were processed using primers P1 and R1, P2 and R2, P3 and R3, P4 and R4, respectively (Figure 1). Amplified products were purified and then ligated into the pMD19-T vector (TaKaRa Biotechnology), and the recombinant clones were sequenced using the ABI PRISM 377 DNA sequencer by following the manufacturer’s instructions. The vector primers M13 (-47) and M13 (-48), as well as PCR primers, were used for the sequencing of mt genome.

Table 1 Primers used for cloning of Babesiaorientalis mt genome sequence
Figure 1
figure 1

Schematic representation of the mt genome of B. orientalis. Symbol … and ()in the sequence indicates nucleotide omitted. Primer sequences of P1, P2, P3, P4, R1, R2, R3 and R4 are shown in light grey. Three protein-coding genes cox 1, cox 3 and cob are shown in dark grey. The first and last six codons of each protein-coding gene are shown below the sequence. Six LSU rRNA genes from L1–L6 are showed. Terminal inverted repeats (TIR) are represented by underlining. The symbol I above the genome sequence indicates the beginning and the end of the gene fragment. The nucleotide position of each gene fragment is shown above the sequences.

Gene annotation and sequence analysis

Nucleotide sequences of B. orientalis mt genome (GenBank accession no. KF218819) were aligned with published mt genome sequences of Plasmodium falciparum (M76611), B. bovis (AB499088), T. parva (AB499089), and T. annulata (NT167255) by MAFFT (version 7) [30, 31] with manual corrections. Protein-coding genes were predicted by comparing the previously annotated sequences from these four related species. To identify putative rRNA genes, mt DNA sequences or annotated rRNA gene fragments from the four related species were used as queries by pair-wise comparison (Blastn) in NCBI. The entire B. orientalis mt genome was subjected to tRNAscan SE 1.21 (http://lowelab.ucsc.edu/tRNAscan-SE/) using Mito/Chloroplast model option and Nematode Mito model for analyzing the existence of tRNA gene. Results from both models were compared, final annotation was determined according to B. bovis mt genome annotation. B. orientalis mt genome was also annotated using appropriate mitochondrial codes in Artemis software v11 [32, 33].

Nucleotide sequence identities were calculated by pairwise comparison between 13 apicomplexan parasites including B. orientalis, another six Babesia species, five Theileria species and P. falciparum for each of the three protein-encoding genes. The data sets of cox1 (472 aa), cox3 (229 aa) or cob (358 aa) were aligned using MAFFT (version 7) employing the FFT-NS-i algorithm [31, 34]. The alignment was manually edited using BioEdit 7.1.11 [35]. The nucleotide identities were determined through BioEdit 7.1.11.

Phylogenetic analysis

The concatenated amino acid sequences of cox1, cox3 and cob from 15 apicomplexan parasites (Table 2) were used for similarity analysis. For phylogenetic analysis, cox3 was not included, because cox3 has been presented in the nuclear genome rather than in mt DNA in some species, such as T. thermophila. A free-living ciliate, Tetrahymena thermophila[36], was used as the outgroup. The best-fit model of cox1, cob, and cox1-cob combined nucleotide substitution was determined by JmodelTest 0.1.1 [37] selected by AIC calculations. A general time reversible model with a proportion of invariable sites and a gamma-shaped distribution of rates across sites (TIM1 + I + G) substitution mode was used in PAUP* v4b10 [38] to explore neighbour-joining, parsimony and maximum likelihood methods. MrBayes v3.1.2 [39, 40] was used to explore Bayesian phylogeny. Consensus trees were edited in MEGA v4.0.2 [41].

Table 2 Mitochondrial genome sequences of apicomplexa parasites used in the present study

All water buffalo studies were carried out in compliance with the regulations (No. 5 proclaim of the Standing Committee of Hubei People's Congress) approved by the Standing Committee of Hubei People's Congress, P. R. China. The animal protocols were approved by Laboratory Animals Research Centre of Hubei province and the ethics committee of Huazhong Agricultural University (Permit number: 4200696657).

Results and discussion

Characterization of B. orientalis mt genome

The full length mt genome of B. orientalis was amplified by P1 and R4, and a 5996 bp fragment was obtained. In order to confirm the mt genome sequence, four overlapping fragments were amplified by using primer sets P1-R1, P2-R2, P3-R3 and P4-R4 with expected sizes of 1749 bp, 1655 bp, 1831 bp and 838 bp, respectively. These four overlapping fragments covered the entire genome of B. orientalis (Figure 1). Each of the PCR products were then cloned into pMD19-T vector, and sequenced. A 5996 bp mt genome sequence of B. orientalis (Wuhan strain) was obtained by assembling all the sequenced fragments.

Sequence analysis indicated that B. orientalis mt genome was arranged in a linear form. It containing three protein-coding genes, cox1 (cytochrome c oxidase I), cob (cytochrome b), cox3 (cytochrome c oxidase III) and six large subunit (LSU) rRNA gene fragments, but not any tRNA genes, which is consistent with those of other apicomplexan parasites studied to date (Figures 1 and 2) [11]. The mt genomes of apicomplexan parasites usually contain terminal inverted repeat (TIR) sequences with the size of around 440-450 bp [5]. However, in the mt genome of B. orientalis, 181 bp and 161 bp TIRs were identified from the beginning and the end, respectively (Figure 1). cox1, the first and fourth rRNA large subunit fragments (L1 and L4) are encoded by one strand of the mt genome, whereas cox3, cob, the second, the third, fifth and sixth fragments of large subunit rRNA genes (L2, L3, L5 and L6) are encoded by another strand (Figure 2a). The arrangement and predicted transcriptional direction of three protein-coding genes are the same as that of B. bovis (Figure 2b), B. gibsoni (not shown), and Theileria parva (Figure 2c), however, it greatly differed from that of T. orientalis, T. equi (not shown) and P. falciparum (Figure 2d) [5].

Figure 2
figure 2

Schematic structure of mitochondrial genomes of Babesia orientalis (KF218819) (a) and three apicomplexan parasites including (b) B. bovis (AB499088), (c) Theileriaparva (Z23263) and (d) Plasmodium falciparum (M76611). Gene names above the bold line are transcribed from left to right, and those below are transcribed from right to left. White boxes indicate protein-coding genes, cox 1 (cytochrome c oxidase I), cox 3 (cytochrome c oxidase III), and cob (cytochrome b). Fragments of LSU (L1 – L6, LA – LG, 1, 2, 3, 10 and 13) and SSU (SA, SB, SD – SF, 4, 8 and 9) rRNA genes are shown by dark and light gray boxes. TIR represents terminal inverted repeats.

In this study, BLAST analysis of the entire mt genome sequence of B. orientalis to NCBI databases revealed that the mt genome sequence of B. orientalis was most similar to those of B. bovis (EU075182 and AB499088), with an identity of 87%. The other related species were B. caballi, B. bigemina and B. gibsoni, showing their mt genome sequence identities of 86%, 85% and 84% with that of B. orientalis, respectively. However, nucleotide identities were greatly different for the sequences of three protein-coding genes cox1, cox3 and cob, when compared among different species. Pairwise comparison has been carried out among 13 different apicomplexan species including B. orientialis, another six Babesia species, five Theileria species and P. falciparum (see Table 3). The nucleotide sequence differences among 13 species ranged from 38.7% to 99.8% for cox1 (Table 3a), from 28.8% to 92.5% for cox3 (Table 3b) and 32.7% to 99.9% for cob (Table 3c), respectively. These results indicated that cox1 and cob genes were more conserved than cox3 gene. The different extent of conservation in the protein-coding genes suggested that mt genome sequences could be used as a gene marker for population evolutionary studies.

Table 3 Nucleotide sequence identity (percentage) between Babesiaorientalis and 12 apicomplexan parasites

Phylogenetic analysis

The majority of phylogenetic studies in the phylum apicomplexa have utilized 18S rRNA genes [27, 42], which allow the analysis of the ancient relationships or strains and species differentiation by focusing on the highly conserved regions defining the critical secondary structure, or the more variable internal transcribed spacer regions, respectively. Because of different extent of conservation of three protein-coding genes and important function of mt genome, the extent of sequence difference might reflect the phylogenetic relationship across species of apicomplexan parasites. So the mitochondrion sequences may provide an alternative approach to conduct these studies [5, 10, 13].

To analyze the phylogenetic relationship of B. orientalis with other apicomplexan parasites, phylogenetic trees were constructed with the amino acid sequences of cox1, cob and cox1 + cob using neighbor-joining (NJ), maximum likelihood (ML), maximum parsimony (MP) and Bayesian phylogenetic methods. The trees obtained from three data sets by different methods were consistent with no significant changes in the topology or in the bootstrap values (Figures 3a,b, and c). The NJ trees constructed with cox1, cob and cox1 + cob sequences have been presented as a representative (Figure 3). All three NJ trees displayed the same topology with high bootstrap values. B. orientalis appeared in the Babesia clade, and it’s most close to B. bovis. However, the bootstrap value was higher in the tree constructed by cox1 + cob as compared to those of cox1 and cob. These results suggested that the combined amino acid sequences of cox1 and cob may be more reliable in studying evolutionary relationships than the sequences of single gene. The relationship of B. orientalis with other apicomplexan parasites revealed by all trees from mt genome sequences were consistent with that from the previous phylogenetic trees based on 18S rRNA and heat shock protein 70 (HSP70) gene [26, 27]. These results demonstrated that the mt genome sequences are useful for the phylogenetic studies of apicomplexan parasites.

Figure 3
figure 3

Phylogenetic analyses of cox 1 (a), cob (b) and cox 1 +cob (c) amino acid sequences of B. orientalis . Each neighbor-joining tree shows the phylogenetic relationship of protein coding genes of Babesiaorientalis mt genome determined in this study with other apicomplexan parasites. Bootstrap values at the nodes are indicating the degree of support for each cluster. Vertical lengths in each tree are not significant and are merely set for clarity. GenBank accession numbers are indicated on the right of each species name. Tetrahymena thermophila was used as an outgroup.

Conclusion

In this study, we first reported the 5996 bp linear mitochondrion genome of B. orientalis. This mt genome contains 6 LSU rRNA gene fragments and three protein-coding genes, but no tRNA gene. Gene arrangement in the mt genome of B. orientalis is similar to those of B. bovis, B. gibsoni, and T. parva, but different from those of T. orientalis, T. equi and P. falciparum. Phylogenies based on the amino acid sequences of cox1 or cob alone and cox1 + cob combined all indicated that B. orientalis is closest to B. bovis, which is in agreement with the previous phylogenetic studies of B. orientalis. The availability of the entire mt genome sequences of B. orientalis provides valuable information for future phylogenetics, population genetics and molecular epidemiological studies of apicomplexan parasites.