Background

Most apicomplexan protists cause important diseases of humans and other animals, including malaria (Plasmodium spp.), toxoplasmosis (Toxoplasma gondii), cryptosporidiosis (Cryptosporidium spp.), cyclosporiasis (Cyclospora cayetanensis), coccidiosis (Eimeria spp.), babesiosis (Babesia spp.), theileriosis (Theileria spp.) and neosporosis (Neospora caninum) [1]. These parasites, with the exception of Cryptosporidium spp. [2], possess a unique, essential, vestigial plastid known as the apicoplast. The discovery of the photosynthetic apicomplexan Chromera (chromalveolate hypothesis), the structural characteristics of the apicoplast/plastid genome and phylogenetic analysis of the glyceraldehyde-3-phosphate dehydrogenase and cytochrome c oxidase subunit 2 genes have demonstrated that the apicoplast has evolved through secondary endosymbiosis of a red alga [3,4,5,6]. The apicoplast is also involved in critical metabolic pathways such as fatty acid, iron-sulphur, haem and isoprenoid biosynthesis. Previous studies have shown that Plasmodium falciparum, Babesia bovis and Babesia bigemina cannot grow with the antibiotic fosmidomycin that causes loss of the apicoplast [7, 8]. Interestingly, isopentenyl pyrophosphate supplementation completely reverses death following treatment with fosmidomycin [8]. The apicoplast and some of these metabolic pathways are vital for parasite survival, thus making the apicoplast an attractive target for anti-parasitic drugs.

Babesia spp. are tick-transmitted haemoprotozoa with a worldwide distribution that have been reported to affect domestic animals, wildlife, companion animals and humans [9]. These pathogens cause fever, anaemia, haemoglobinuria and jaundice in acute infections, while chronic infection is asymptomatic. Babesiosis of cattle and small ruminants also has great economic importance due to reduced meat and milk production, and costs associated with treatment, prevention and the disposing of carcasses. The main causative agents of babesiosis in small ruminants are Babesia ovis and Babesia motasi, which are transmitted by Rhipicephalus spp. and Haemaphysalis spp and are distributed in Asia, Africa, South America, Europe and the Far East [9,10,11]. To date, several ovine Babesia species or geographical isolates, including Babesia sp. Xinjiang (BspXJ), Babesia sp. Dunhuang (BspDH), B. motasi Lintan (BmLT), B. motasi Tianzhu (BmTZ), B. motasi Hebei (BmHB) and B. motasi Ningxian (BmNX), have been isolated from small ruminants in China. They have significant differences in vector specificity, serology, virulence and pathogenicity. Importantly, BspXJ and BspDH are transmitted by Hyalomma anatolicum and mainly cause subclinical infections, whereas BmLT, BmTZ, BmHB and BmNX are transmitted by Haemaphysalis spp. and cause mild to severe clinical signs [12,13,14,15,16,17,18,19,20]. Seroepidemiological surveys by enzyme-linked immunosorbent assay (ELISA) have shown that the prevalence of Babesia sp. and B. motasi are 30.4–31.7% and 36.0–43.5%, respectively [19, 21,22,23]. These data indicate that ovine Babesia spp. are widespread in China.

Thus far, sequencing and annotation of the apicoplast genomes have been performed for Plasmodium spp., T. gondii, Leucocytozoon caulleryi, Cyclospora cayetanensis, Theileria parva, Eimeria tenella, B. bovis, Babesia microti, Babesia orientalis, BspXJ and BmLT. These studies have revealed that apicoplast genomes lack the genes for metabolic function and regulatory proteins, but are present SNPs and repeats [24,25,26,27,28,29,30,31,32,33,34,35,36,37]. Apicoplast genomic data have been used to understand and explore the biological features of the pathogens, and the taxonomy and evolutionary relationships among apicomplexan parasites [33, 36]. However, there is limited information on the apicoplast genomes of Babesia species infective to small ruminants in China. Although the apicoplast genomes of BspXJ and BmLT have been sequenced and characterized using Illumina technology directly from genomic DNA, they were not verified using PCR amplification and sequencing [36].

In the present study, we employed a Sanger dideoxy chain-termination method to sequence the apicoplast genomes of BspXJ, BspDH, BmLT, BmTZ, BmHB and BmNX, and used bioinformatics software to conduct the assembly and annotation of full sequences of these isolates to verify the published sequences of BspXJ and BmLT sequenced using Illumina technology. Phylogenetic trees were constructed using the apicoplast genomic data to determine the taxonomic relationships among these geographical isolates of Babesia infective to sheep and goats in China. The study provides essential data for clarifying the classification of the Babesia species infective to small ruminants and lays the foundation for performing research on metabolic pathways and identifying diagnostic markers and drug targets for Babesia infections.

Methods

Parasites and isolation of genomic DNA

BspXJ, BspDH, BmLT, BmTZ, BmHB and BmNX were isolated from splenectomised sheep that were infected or infested with field-collected sheep blood or ticks from Xinjiang, Hebei and Gansu (Lintan, Tianzhu, Ningxian and Dunhuang counties) provinces during the period 2000–2010. The purified merozoites of BspXJ, BspDH, BmLT, BmTZ, BmHB and BmNX were provided by the vectors and vector-borne diseases laboratory in Lanzhou Veterinary Research Institute, China [38]. Genomic DNA was extracted from the merozoites using a QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany), according to the manufacturer’s instructions. The DNA concentration and quality were measured using the 260/280 nm absorbance ratio on a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). DNA was stored at − 20 °C until PCR amplification.

Amplification and sequencing of apicoplast genomes

The PCR primers were designed on the basis of the reported apicoplast genomic sequences of BspXJ (KX881914) and BmLT (KX881915), sequenced using Illumina technology [36]. The primer details are shown in Additional file 1: Table S1. Overlapped fragments covering whole apicoplast genomes were amplified from the genomic DNA of BspXJ, BspDH, BmLT, BmTZ, BmHB and BmNX. The PCR products were directly sequenced using a BigDye Terminator v.3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA, USA) on an ABI 3730 DNA analyser (Applied Biosystems) or cloned into pBluescript II SK(+) using a Clone Express® II One Step Cloning Kit (Vazyme, Nanjing, China) for subsequent sequencing.

Assembly and annotation of the apicoplast genomes

The CLC Genomics Workbench v.7.5.1 (Qiagen, Redwood City, CA, USA) was used to assemble the apicoplast genomes according to the user manual. Genome annotation was performed using the software Artemis [39] and BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The use of these two methods guarantees the robustness of the annotation. The E-value of BLAST is 0.0 and we were blasting against a non-redundant protein sequence (nr) database. The tRNA genes were identified using tRNAscan-SE v.2.0 (http://lowelab.ucsc.edu/tRNAscan-SE/) with the default search mode and other mitochondrial sequence sources [40]. Genetic maps were obtained using the online software CGView (http://stothard.afns.ualberta.ca/cgview_server/) [41]. Mauve (http://gel.ahabs.wisc.edu/mauve) was used to generate the genome comparisons [42]. The nucleotide sequences and annotation information reported in this article were submitted to the GenBank database under the accession numbers MH992224-MH992229.

Phylogenetic analysis

The 18 apicoplast genomic sequences, six from the ovine Babesia isolates from this study and 12 from apicomplexan parasites obtained from GenBank [Babesia sp. Xinjiang (BspXJ), B. motasi Lintan (BmLT), B. bovis, B. orientalis, B. microti, Theileria parva, Plasmodium falciparum, Cyclospora cayetanensis, P. chabaudi chabaudi, Toxoplasma gondii, Eimeria tenella and Leucocytozoon caulleryi], and one chloroplast genomic sequence of Chromera velie (GenBank: HM222967), were used in the phylogenetic analysis (Table 1). On the basis of the annotation information, 13 encoding genes (rpl2, rpl4, rpl6, rpl14, rpl16, rps2, rps3, rps4, rps7, rps11, rps12, tufA and ropB) of each genomic sequence were used to deduce their amino acid sequences. The concatenated sequences of 4683 amino acid residues from each species were put into multi-alignment using Clustal W with further manual verification. Subsequently, MEGA v.6.06 (http://www.megasoftware.net/) software was applied to conduct phylogenetic analysis. A bootstrap phylogenetic tree demonstrating the relationship of BspXJ, BspDH, BmLT, BmTZ, BmNX and BmHB to other apicomplexan parasites was created by the maximum likelihood (ML) method or neighbour-joining (NJ) method, using a distance matrix corrected for nucleotide substitutions based on the JTT with Freqs model. In addition, phylogeny of the whole apicoplast nucleotide sequence was constructed by the ML method based on the Kimura 2-parameter model. A bootstrap analysis was used to assess the robustness of the clusters using 1000 replicates.

Table 1 Comparison of apicoplast genome sequences from 16 apicomplexan parasites

Results

Sequence analysis of the apicoplast genomes of ovine Babesia

Sequencing and assembly revealed that the apicoplast genomes of the six ovine Babesia isolates were formed of circular DNA, ranging from 29,916 to 30,846 bp in length with a high A + T content of 78.7–81.0% (Table 1). Bioinformatics analysis indicated that the circular DNA contained a small subunit and a large subunit ribosomal RNA (SSU and LSU), 23–24 transfer ribonucleic acids (tRNAs) and five to six ribosomal protein large subunits (rpl), eight to nine ribosomal protein small subunits (rps), four to five subunits of DNA-directed RNA polymerase (rpo), two ATP-dependent Clp proteases (clpC1, clpC2), one elongation factor TU (tufA), and seven to eleven hypothetical protein genes (hyp) (Table 2).

Table 2 Gene contents of the apicoplast genomes of six ovine Babesia isolates

All coding genes were transcribed in the same orientation (Additional file 2: Figure S1; Additional file 3: Figure S2; Additional file 4: Figure S3; Additional file 5: Figure S4; Additional file 6: Figure S5; Additional file 7: Figure S6). In total, 7094, 7220, 7384, 7436, 7163 and 7212 amino acids were encoded in the apicoplast genomes of the six ovine Babesia species. All the protein-encoding genes had ATG as a translation start codon. Most of the apicoplast protein-encoding genes had TAA as a translation stop codon, followed by TGA and TAG (Table 3). The alignment of apicoplast genomes of ovine Babesia isolates indicated that the identities of BspXJ/DH, BmLT/TZ and BmNX/HB were 99.8, 99.9 and 99.9%, respectively. BmLT/TZ and BmNX/HB had 70.9 and 71.6–71.7% identity, respectively, to BspXJ/DH, and that between BmLT/TZ and BmNX/HB was 86.9–87.0%. Based on the nucleotide sequence analysis of the whole apicoplast, there were multiple base differences among BspXJ/DH, BmLT/TZ and BmNX/HB (Additional file 8: Table S2).

Table 3 Initiation codons and termination codons used in encoding genes of the six ovine Babesia apicoplast genomes

Alignment of apicoplast genomes with those of other apicomplexan parasites

Four gene clusters were found in the BspXJ/DH, BmLT/TZ and BmHB/NX apicoplast genomes. They were in synteny with the same gene clusters of other apicomplexan parasites (Additional file 9: Figure S7). Gene cluster 1 included those encoding ribosomal proteins and tufA genes (Fig. 1). Similar to the gene organization of B. bovis, B. orientalis, B. microti, T. parva, T. gondii, E. tenella and C. cayetanensis apicoplast genomes, the six ovine Babesia isolates lacked the rpl23 gene, but it was present in P. chabaudi chabaudi and P. falciparum between the rpl4 and rpl2 genes. Hypothetical protein genes were not found in Bsp/DH, B. microti, T. parva, T. gondii and P. falciparum between rps7 and tufA, whereas BmLT/TZ, BmNX/HB, B. bovis and B. orientalis had two to seven hypothetical protein genes in the locus. Gene cluster 2 consisted of hypothetical protein genes and ClpC chaperone genes. Similar to B. bovis, B. orientalis, B. microti and T. parva, the ClpC genes of the six ovine Babesia isolates were duplicated, with both copies containing the AAA_2 ATPase domain, whereas T. gondii and P. falciparum contained one ClpC gene (Fig. 1).

Fig. 1
figure 1

Comparison of the apicoplast genomes of six ovine Babesia isolates and other apicomplexan parasites. The circular apicoplast genomes are displayed in a linear format, beginning with the small subunit rRNA genes. It was performed using SnapGene software and Adobe Photoshop. The different shades of gray represent different gene types. In detail, white represents tRNA, 25% grey represents hypothetical protein, 50% grey represents SSU and LSU, and black represents already annotated functional protein

Gene cluster 3 included RpoB, RpoC and rps2 genes, which were consistent with the gene orientation and contents of cluster 3 in B. bovis, B. orientalis, B. microti and T. parva. Cluster 3 of the piroplasma apicoplast genomes lacks the sufB gene involved in iron-sulphur cluster synthesis. Gene cluster 4 of the piroplasms had several tRNA genes and a single set of SSU and LSU genes, which are transcribed in the same orientation, whereas, this cluster consisted of two sets of SSU and LSU genes in opposite orientations in other parasites. Unlike BspXJ/DH, BmLT/TZ and BmNX/HB, one to two tRNA genes exist between the SSU and LSU genes in B. bovis, B. orientalis, B. microti and T. parva (Fig. 1).

Comparison of genomic data sequenced by the Sanger dideoxy chain-termination method and Illumina technology

The apicoplast genomic sequences of BspXJ (KX881914) and BmLT (KX881915) sequenced using Illumina technology (designated BspXJ-Illumina and BmLT-Illumina) were compared with those obtained in this study using the Sanger dideoxy chain-termination method (designated BspXJ-Sanger, MH992224; BmLT-Sanger, MH992226). The results indicated that the size of the apicoplast genome, the number of tRNA and protein-encoding genes, and the initiation and termination codons used in the encoding genes are different in the two sets of data from Sanger and Illumina (Tables 1, 3). In addition, the A + T contents and the number of rRNA molecules were consistent but there were differences in the bases at several nucleotide positions along the full-length genome between the first-generation and second-generation data (Table 4).

Table 4 Comparison of Babesia sp. Xinjiang and Babesia motasi Lintan apicoplast genome were sequenced using Illumina technology (BspXJ-Illumina and BmLT-Illumina) and the Sanger dideoxy chain-termination method (BspXJ-Sanger and BmLT-Sanger)

Phylogenetic analysis

The ML and NJ trees were constructed using the concatenated amino acid sequences of 13 gene products (4683 residues) from each apicoplast genomic sequence of 18 apicomplexan parasites and the chloroplast genomic sequence of Chromera velie (as an outgroup). The two approaches gave the same topology. In the amino acid phylogenetic trees, the piroplasms were divided into three groups: classical Babesia infective to ruminants, Theileria and B. microti. The classical Babesia species were separated into three clades: B. motasi, Babesia sp. Xinjiang/Dunhuang and B. bovis/B. orientalis. In addition, B. motasi were further divided into two subclades: B. motasi Lintan/Tianzhu and B. motasi Ningxian/Hebei (Fig. 2). In the whole apicoplast nucleotide phylogenetic tree, the grouping result for the six ovine Babesia isolates is the same as that of the amino acid phylogenetic trees (Additional file 10: Figure S8).

Fig. 2
figure 2

Phylogenetic relationships of Babesia infective to small ruminants in China and other apicomplexan parasites. Phylogeny was inferred with a maximum likelihood analysis of amino acid sequence data for 13 selected apicoplast genome-encoded genes based on distances calculated with the JTT with Freqs model. Chromera velia (HM222967) was used as the outgroup. Bootstrap values > 50% from 1000 replicates are shown on the nodes. Babesia obtained in this study is shown as triangles

Discussion

Babesia spp. are tick-transmitted haemoprotozoans that infect various animal species including humans, causing loss of livestock and public health concern in tropical and subtropical regions of southern Europe, Africa, Asia, Australia and the Americas [9, 36]. The apicoplast is a unique organelle found in apicomplexan parasites; it is considered a target for screening anti-parasitic drugs because it plays important roles in metabolic pathways for fatty acid, iron-sulphur, haem and isoprenoid biosynthesis [43]. In the present study, the apicoplast genomes of six ovine Babesia isolates were sequenced, assembled, annotated and compared with those of other apicomplexan parasites. The apicoplast genomes are smaller than those of most apicoplexan parasites [24, 28, 29, 33, 37, 44], but slightly larger than those of P. chabaudi chabaudi [35] and B. microti [25].

The content and order of genes in the cluster differ among the parasites. The rpl23, rpl11 and sufB genes are deficient in the apicoplast genomes of the six ovine Babesia spp., which is similar to B. bovis, B. orientalis, B. microti, T. parva, T. gondii, E. tenella and C. cayetanensis. It is possible that these genes were lost during the genetic evolution of the apicoplast in most apicomplexan parasites [25]. Therefore, some researchers speculate that the rpl23 and rpl11 genes do not play an important role in the growth and development of these parasites; alternatively, they may be translated at other sites in the apicoplast genome or directly encoded by the nuclear genome. The loss of the sufB gene may have been caused by the gene inversion of cluster 3 during the evolution of Piroplasmida [24, 25]. In this study, different numbers of hypothetical protein genes were found by aligning the apicomplexan apicoplast genomes, which suggests that hyp genes may be one of the causes of the variation in apicoplast genome size. Previous studies have suggested that rearrangements and loss of the genes involved in photosynthesis were thought to be responsible for the formation of apicoplast [24, 25]. Furthermore, gene deletion (sufB, rpl23, rpl11 and hyp), inversion (RNA polymerase), duplication (ClpC and hyp) and restructuring (SSU and LSU) were also important events during the early evolution of Babesia species. These modifications may have caused the apicoplast genomes of BmNX and BmHB to be the smallest among the reported Babesia in cattle and small ruminants.

Comparison of the BspXJ and BmLT apicoplast genome data sequenced by first-generation and second-generation sequencing techniques showed that there were some differences in genome sizes and the numbers of tRNA and protein-encoding genes. For example, rpl5 and rpl6 genes were present between rpl14 and rps5 genes in the second-generation sequence data of BmLT, while only the rpl6 gene exists in this region in the first-generation sequence data of the four B. motasi isolates. It is possible that these differences involving nucleotide substitutions, insertions and deletions were due to the different sequencing methods and annotation software used. Additionally, settings of a few parameters comprising ORF length, nested ORFs, start and stop codon, and genetic code were used. Therefore, it is recommended to perform first-generation sequencing to verify sequences based on the second-generation sequencing sequence, to annotate genes using two or more annotation software programs and then further improve the data by using BLAST.

Studies on the biological characteristics (including morphology, pathogenicity, vector tick, serology and in vivo or in vitro propagation) and molecular classification (target genes including SSU, ITS, LSU, HSP90, cox1, cytb, cox3, RPS8 and TRAP) of the Babesia species infective to small ruminants in China have shown that these parasites are divided into two Babesia species: B. motasi and Babesia sp. There are two further subspecies of B. motasi, named BmLT/TZ and BmNX/HB (with differences in pathogenicity, serology, in vitro culture features and target gene sequences). These findings are consistent with the Uilenberg inference [12,13,14,15, 17, 18, 20, 38, 45,46,47,48,49,50,51]. In this study, phylogenetic analyses using concatenated amino acids or whole apicoplast nucleotide sequences confirmed the taxonomical relationships among Babesia species infective to small ruminants in China. In the ML and NJ trees, the ovine Babesia fell into two groups: Babesia sp. and B. motasi. Babesia motasi was further divided into two small clades, BmLT/TZ and BmHB/NX.

The apicoplast genomes in apicomplexan Babesia parasites remain conserved throughout all the species, including the six ovine Babesia isolates, and Babesia apicoplast functions are significantly different from those of the host [25], suggesting that they might be useful as targets for the development of potent and safe therapies for the treatment of babesiosis. Our data provide useful information confirming the taxonomical relationships of these parasites and identifying targets for anti-apicomplexan parasite drugs.

Conclusions

In the present study, we have accomplished the sequencing, assembly and annotation of the apicoplast genomes of six ovine Babesia isolates from China. This study has also confirmed our previous inference that there are two Babesia species (Babesia sp. and B. motasi) infective for small ruminants in China, and that the four B. motasi isolates possibly belong to two subspecies (BmLT/BmTZ and BmNX/BmHB). Further studies are needed to validate the effects of anti-Babesia drugs against DNA replication, transcription, translation and 2-C-methyl-d-erythritol 4-phosphate pathways of the apicoplast.