Introduction

Apicomplexa constitute a eukaryotic phylum that, together with two others, Ciliophora and Dinoflagellata, forms the superphylum Alveolata under the protistan kingdom Chromalveolata [1, 2]. To date, about 6,000 apicomplexan species have been named, but sequencing data of environmental samples suggest there may be millions more species belonging to this phylum [3]. Most named apicomplexans are obligate parasites, and some of them cause important human or animal diseases such as malaria (caused by Plasmodium spp.), toxoplasmosis (Toxoplasma gondii), coccidiosis in poultry (Eimeria spp.), babesiosis (Babesia spp.), theileriosis (Theileria spp.), and cryptosporidiosis (Cryptosporidium spp.). However, many apicomplexans are not pathogenic to their host.

Although apicomplexans apparently lack photosynthesis, they have a secondary plastid, the apicoplast (Fig. 1). Because of their clinical, veterinary, or economic importance, disease-related apicomplexans have been extensively researched, and in several instances, both the apicoplast and the nuclear genomes have been sequenced. Metabolic pathways involving the apicoplast have attracted attention as potential targets for disease-controlling drugs, since they might directly contribute to the survival of the apicomplexan cell. By contrast, housekeeping functions of the organelle have attracted less attention, though they are often unique or significantly different from those of other organisms. This article mainly discusses housekeeping functions unique to the apicoplast.

Fig. 1

Apicomplexans and the plastid. a Phylogeny of alveolates and distribution of the plastid. The phylogenetic tree was drawn based on the nuclear-encoded 18S rRNA sequences of representative species available in the databases and suggests only topological relationships between taxa. Distribution of the plastid in most Gregarinasina species has not yet been studied (see text). b Phylogeny of the plastids and their variety. Like red plastids (purple), apicoplasts (red, orange, yellow) have the genome encoding sufB, while organisms with green plastids (green) have the gene encoded in the nucleus. Unlike other plastids, the apicoplast genomes lack the 5S rRNA gene (rrf). Like other plastid-bearing organisms, Toxoplasma, Eimeria, and Plasmodium have the Suf system incorporating the SufBCD complex, while Babesia and Theileria lack genes specifying the components of the complex. Toxoplasma, Eimeria, and Plasmodium have a unique hybrid-type heme pathway that involves mitochondrial ALA synthase (ALAS). Although Babesia lacks the heme pathway, it has the PBGS gene forming a tight gene cluster with the SPP gene in the nuclear genome like Toxoplasma, Eimeria, and Plasmodium. Unlike other organisms, Eimeria lacks the intron in the trnL(UAA) gene in the plastid genome.

The apicoplast—the plastid of apicomplexan cells

Discovery of the plastid in apicomplexan cells

An extremely A + T rich extrachromosomal DNA was identified in Plasmodium spp. [4]. The DNA was initially believed to be mitochondrial [5–8], but later was shown to be related to the plastid DNA of algae and plants [9]. Other apicomplexan genera such as Toxoplasma, Eimeria, and Theileria have a similar plastid-like DNA [9, 10] sharing highly conserved coding sequences [11, 12]. However, intensive studies to detect these genes in Cryptosporidium parvum [13] and Gregarina niphandrodes [14] have failed, indicating that some apicomplexan species lack the plastid-like DNA.

McFadden et al. [15] reported that rRNA encoded by the plastid-like DNA is accumulated in a small ovoid organelle located anterior to the nucleus of T. gondii. Independently, Köhler et al. [10] showed by in situ hybridization that the plastid-like DNA is strictly localized in an organelle, and coined the term “apicoplast”, an abbreviation of “apicomplexan plastid” that is now widely used. Chaubey et al. [16] reported that the EF-Tu protein expressed from the gene on the plastid-like DNA localizes in the apicoplast as distinct from the mitochondrion in P. falciparum. It is generally believed that every apicomplexan species with the plastid-like DNA has an apicoplast, though this has yet to be confirmed experimentally in genera other than Plasmodium and Toxoplasma. Perhaps some species have an apicosome, an apicoplast without organellar DNA, though none has yet been reported.

Morphology and potential association with other organelles

The number of membranes surrounding the apicoplast is generally recognized as four [15], and no internal thylakoid-like membranous structure has ever been observed. However, Hopkins et al. [17] proposed that the P. falciparum apicoplast probably had only three membranes. They also reported that the organelle harbors two unique membrane complexes. The inner membrane complex was predicted to be a rolled-up myelin-like invagination of the innermost membrane. On the other hand, an outer membrane complex, which lies between the outermost and the middle membrane, remains of uncertain origin. Both the outer and inner membrane complexes increase in size and in complexity during the parasite’s development from merozoite to trophozoite.

Köhler [18] reported that the apicoplast membrane of T. gondii was not consistent throughout the organelle and at least one extensive sector appeared to be bordered solely by two membranes. All apicoplasts examined, including those in organellar division, have a single pocket-like invagination of about 50–200 nm in width, often located in close proximity to a voluminous evagination of the outer membrane of the parasite’s nuclear envelope.

Tomova et al. [19] isolated Sarcocystis sp. from roe deer and analyzed the ultrastructure of its apicoplast using a combination of high-pressure freezing, freeze-substitution, and electron tomography. This apicoplast had four continuous membranes: two inner membranes of circular profile with a constant distance between them and two outer membranes of irregular shape. The outermost membrane displayed protuberances into the parasite cytoplasm and was associated with the endoplasmic reticulum (ER) at ‘contact sites’. Similar contact sites were observed in T. gondii, but no fusion point was observed [20]. Thus it is uncertain if the apicoplast is connected to the ER like the secondary plastids of cryptophytes [21].

Hopkins et al. [17] mentioned that the intracellular position of the apicoplast in P. falciparum varied considerably depending on the developmental stage of the parasite, but there was a close association between the apicoplast and mitochondrion in all cellular stages, including merozoites. Kobayashi et al. [22] reported the apicoplast and the mitochondrion were inseparable from one another using two different techniques—Percoll density gradient centrifugation and fluorescence-activated organelle sorting. This agrees with the suggestion now prevalent that the apicoplast in Plasmodium is physically bound to the mitochondrion via the cytoskeleton.

The apicoplast genome

The complete gene content and map of the apicoplast genome were first determined for P. falciparum [9], and then for two coccidian species, T. gondii (Kissinger et al., unpublished: database sequence with accession no. U87145) and E. tenella [23], followed by two piroplasmids, Theileria parva [24] and Babesia bovis [25]. The gene content of the apicoplast genomes is highly conserved apart from a few lineage specific genes; the genome of each species commonly encodes SSU and LSU rRNAs (rrs and rrl), three subunits of the bacteria-type RNA polymerase (rpoB, rpoC1, rpoC2), 16 ribosomal proteins, an EF-Tu, a ClpC-like protein and 24 tRNA species, the minimum sufficient for translation without importing a tRNA from the cytosol. On the other hand, genes for DNA metabolic enzymes, the α subunit of the bacteria-type RNA polymerase (rpoA) and some ribosomal proteins are missing from the apicoplast genome. These and other genes specifying most if not all proteins involved in the organellar functions are encoded by the nuclear genome.

The apicoplast genomes of Plasmodium and coccidian genera contain an inverted repeat (IR) each half of which consists of rrs, rrl, and nine tRNA genes. Unlike those of other plastid DNAs, the rrs and rrl in each half of the IR are arranged head-to-head, and all protein-coding genes are arranged in two large clusters following the rrl genes. By contrast, there is no IR in the apicoplast genome of piroplasmids, despite the fact that it contains a very similar set of genes to Plasmodium and coccidians. All genes in the piroplasmid apicoplast genome, except for the duplicated clpC and several repetitive hypothetical ORFs, are single-copy and arranged on the same DNA strand, potentially forming one big gene cluster.

Likewise, topology of the apicoplast DNA varies depending on species. Essentially all molecules of apicoplast DNA in P. falciparum are circular and about 35 kb in size; only a minor population (about 3%) are linear [26]. On the other hand, the circular form represents only 9% of the total mass of the apicoplast DNA in T. gondii; the remaining >90% existing as linear concatemers of the 35-kb units up to dodecamer [27]. The apicoplast DNAs of E. tenella [28] and Neospora caninum [29] also have been reported to comprise linear molecules of several different sizes. The topology of the apicoplast DNA for other apicomplexan species is unknown.
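
The sizes of the linear concatemers follow from simple multiplication of the unit length. As a rough sketch, assuming the unit is exactly 35 kb (the text gives this figure only approximately) and using a hypothetical helper name, the expected sizes from monomer to dodecamer can be listed as:

```python
UNIT_KB = 35    # approximate unit size quoted in the text; treated as exact here (assumption)
MAX_UNITS = 12  # dodecamer, the largest concatemer reported for T. gondii


def concatemer_sizes_kb(unit_kb=UNIT_KB, max_units=MAX_UNITS):
    """Return the expected size (kb) of each linear concatemer of 1..max_units units."""
    return [unit_kb * n for n in range(1, max_units + 1)]


sizes = concatemer_sizes_kb()
print(sizes[0], sizes[-1])  # smallest (monomer) and largest (dodecamer) size in kb
```

Under this assumption the largest linear molecules would be about 420 kb, twelve times the size of the circular form.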

Nuclear-encoded apicoplast proteins

Like those of other secondary plastids [30–33], apicoplast proteins encoded by the nuclear genome generally have a bipartite organellar targeting sequence at the N terminus [34, 35]. The transit peptide following the N terminal signal sequence of the bipartite apicoplast targeting sequence of P. falciparum is rich in asparagine and lysine residues rather than the serine/threonine found in plants [34]. The transit peptide often contains putative Hsp70-binding sites, a feature that was combined with earlier criteria to create the prediction algorithm PlasmoAP [36], which predicts that more than 500 proteins encoded by the nuclear genome of P. falciparum have an apicoplast targeting sequence at the N terminus [36, 37]. About 150 of them show significant sequence similarity to other proteins of known function or structure in the sequence databases: the most prominent include enzymes involved in de novo biosynthesis of isoprenoid, fatty acid and heme, as well as housekeeping proteins such as DNA polymerase, DNA gyrase subunits, ribosomal proteins, molecular chaperones, and components of a Suf type Fe–S cluster assembly system [37, 38]. There remain over 350 hypothetical “stromal” proteins with obscure functions. In addition to these 500+ proteins, a nuclear-encoded protein that lacks a typical bipartite organellar targeting sequence has been identified to localize to the apicoplast of P. falciparum. This exceptional protein, PfoTPT, is one of the two triose phosphate/phosphate transporters (TPT) that mediate the movement of phosphorylated C3, C5, and C6 compounds such as phosphoenolpyruvate across the apicoplast membranes [39]. It has been proposed that PfoTPT is inserted into the ER membrane by the first of its ten transmembrane domains, which functions as an internal signal peptide, before being transferred to the outermost membrane of the apicoplast [39]. The apicoplast-related genes in the nuclear genomes of other Plasmodium spp., including that of oTPT, are remarkably similar to those of P. falciparum [40, 41], with the exon/intron structures almost perfectly conserved.
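
The compositional bias of the transit peptide described above can be illustrated with a simple residue count. The sketch below is not PlasmoAP and does not reproduce its rules; the function name and the peptide string are made up for illustration, and the code only compares the fraction of asparagine/lysine (N, K) with that of serine/threonine (S, T) in a given stretch of sequence.

```python
def residue_fraction(peptide, residues):
    """Fraction of positions in `peptide` occupied by any one-letter code in `residues`."""
    peptide = peptide.upper()
    if not peptide:
        return 0.0
    return sum(peptide.count(r) for r in residues) / len(peptide)


# Hypothetical 20-residue stretch, invented for this example (not a real protein).
toy_transit = "NKNNKLYNKKNINKNNKKYN"

nk = residue_fraction(toy_transit, "NK")  # asparagine + lysine
st = residue_fraction(toy_transit, "ST")  # serine + threonine
print(f"N/K fraction: {nk:.2f}, S/T fraction: {st:.2f}")
```

A real predictor would also need to recognize the preceding signal peptide and apply further criteria, so a high N/K fraction alone should not be read as evidence of apicoplast targeting.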

The nuclear genomes of three other plastid-bearing apicomplexan species, T. parva [24], T. annulata [42], and B. bovis [25] have also been sequenced. Apart from unequally expanded gene families, both Theileria nuclear genomes contain an almost identical set of genes [42]. While localization of nuclear-encoded apicoplast proteins has not been shown experimentally, manual inspection predicts that 345 of the 4,035 proteins annotated in the nuclear genome of T. parva are targeted to the apicoplast; of these, 69 have functions predictable from orthologs in other organisms [24]. Some apicoplast proteins identified in Plasmodium spp., such as the enzymes for synthesis of fatty acid and heme and other housekeeping proteins such as SufC involved in plastidic Fe–S cluster assembly system, are missing from Theileria spp. These omissions indicate that the function of the apicoplast has been greatly streamlined in these organisms. The percentage of apicoplast-targeted proteins that have been identified in the nuclear genome of B. bovis (47 proteins of total 3,671 = 1.3%) [25] is remarkably lower than in P. falciparum (8.8–10.3%) [37, 43], T. parva (8.6%) [24] or A. thaliana (7.9%) [44]. This suggests that the B. bovis genome encodes more apicoplast-targeted proteins than those detected by the prediction algorithms [25]. Indeed, an almost complete set of orthologs of putative apicoplast-targeted proteins of T. parva seem to be present in the nuclear genome, though most of them have not been predicted as apicoplast targeted [25]. Sequencing projects for two coccidian species T. gondii and E. tenella are still going on and putative nuclear-encoded ‘apicoplast-targeted proteins’ are listed in the partial sequence data [45]. Apicoplast localization has been confirmed for only a few of these proteins to date.
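
The percentages quoted for B. bovis and T. parva can be recomputed directly from the protein counts given above. This is a minimal sketch; the dictionary layout and function name are illustrative choices, and only the two species for which the text gives raw counts are included.

```python
# (predicted apicoplast-targeted proteins, total annotated proteins), as quoted in the text
counts = {
    "B. bovis": (47, 3671),
    "T. parva": (345, 4035),
}


def targeted_percentage(targeted, total):
    """Percentage of annotated proteins predicted to be apicoplast targeted, to one decimal."""
    return round(100.0 * targeted / total, 1)


percentages = {species: targeted_percentage(*pair) for species, pair in counts.items()}
for species, pct in percentages.items():
    print(f"{species}: {pct}%")
```

The recomputed values agree with the 1.3% and 8.6% stated in the text.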

Non-housekeeping functions of the apicoplast

Apicoplasts are presumed to contribute to the metabolism of the cells that maintain them. Catalogues of nuclear-encoded enzymes predict that the apicoplast of P. falciparum is involved in isoprenoid biosynthesis via the DOXP/MEP pathway, fatty acid synthesis with the type II fatty acid synthase, and heme biosynthesis in collaboration with the mitochondrion [37, 43, 46, 47]. Similar predictions have been made for coccidian parasites such as T. gondii [45]. By contrast, the apicoplasts of piroplasmids such as T. parva [24] and B. bovis [25] are unlikely to contribute to fatty acid or heme biosynthesis. Studies using specific inhibitors have been inclined to suggest the importance of some apicoplast metabolisms, but they are not conclusive by themselves as predicted apicoplast metabolisms might be redundant. For example, triclosan, the specific inhibitor of enoyl-ACP-reductase in the type II fatty acid synthesis system, strongly suppressed the growth of P. falciparum [48, 49], though the drug also strongly inhibited the growth of T. parva that lacks enoyl-ACP-reductase [50]. This suggests there is a yet-to-be-identified target of triclosan in the organism, and perhaps in other apicomplexans as well. Another example is fosmidomycin, the specific inhibitor of DOXP reductoisomerase involved in apicoplast isoprenoid biosynthesis. Fosmidomycin strongly inhibits the growth of P. falciparum [51] but has little inhibitory effect on T. gondii [52] and T. parva [50]. Perhaps subtle differences in the structure of the target enzyme DOXP reductoisomerase or the accessibility of the inhibitor to the enzyme are responsible for this difference [52]. Alternatively, the importance of the isoprenoid metabolism may depend on species. Recently, Brooks et al. [53] made a T. gondii strain in which the apicoplast membrane-localized phosphate translocator (TgAPT) gene was knocked out conditionally. 
Analysis of the phenotype suggested that isoprenoid biosynthesis in the apicoplast is essential for parasite survival. A similar genetic approach should be applied to establish the importance of individual metabolic pathways hypothetically attributed to the apicoplast of each species.

Housekeeping functions of the apicoplast

Maintenance of the apicoplast DNA

The blood-stage P. falciparum contains not more than three copies of organellar genome prior to replication; this increases by more than ten times to distribute apicoplast DNA to each daughter cell in schizogony [26]. Two different mechanisms are involved in the replication of the DNA. One uses twin D-loops in the IR region, forming a θ-form intermediate and is highly sensitive to the DNA topoisomerase inhibitor ciprofloxacin [26, 54, 55]. The other, less drug sensitive mechanism probably involves rolling circles likely to initiate outside the IR, but has not been characterized.

Electron microscopic analysis of apicoplast DNA prepared from T. gondii tachyzoites found replicating molecules in lariat rather than θ-form replication intermediates [27]. Compared to blood-stage P. falciparum, T. gondii tachyzoites contain much more apicoplast DNA: at least 25 units of the apicoplast genome per cell [56]. In addition, organellar division occurs much more frequently in T. gondii tachyzoites than in blood-stage P. falciparum. Such factors might account for the different mode of DNA replication. As mentioned earlier, the apicoplast DNAs of T. parva and B. bovis lack the IR, all genes including those of rRNAs and tRNAs being arranged on the same DNA strand. These features suggest that piroplasmid apicoplast DNAs might replicate by a rolling circle mechanism like T. gondii, though experimental evidence is unavailable.

The DNA polymerase responsible for DNA replication in the apicoplast has yet to be determined. Nonetheless, one nuclear-encoded enzyme predicted to be involved has been characterized and named the plastidic DNA replication/repair enzyme complex (Prex). An apicoplast targeting sequence precedes a multifunctional polypeptide that comprises a primase–helicase domain resembling the Twinkle helicase of mammalian mitochondria, and an exonuclease–polymerase domain like DNA polymerase I of Aquifex aeolicus [57]. An ortholog occurs in the nuclear genome of every plastid-bearing apicomplexan species examined, but not in the genomes of either plastid-lacking Cryptosporidium spp. or non-apicomplexan species. This implies that Prex is important for the maintenance of apicoplast DNA.

Like the bacteria from which they evolved, the plastids of some algae such as Cyanidioschyzon merolae have an HU protein that assembles organellar DNA into the structure known as the nucleoid [58]. An HU ortholog with an apicoplast targeting signal is encoded in the nuclear genome of each plastid-bearing apicomplexan species examined to date. In the case of P. falciparum, HU localizes in the apicoplast and binds the apicoplast DNA in a sequence-independent manner [59, 60]. PfHU is distributed throughout the interior of the apicoplast of P. falciparum [60], agreeing with the distribution of the apicoplast DNA suggested by electron microscopy [17] and visualization with DNA-binding fluorescent dyes [61]. Likewise, Matsuzaki et al. [56] and Köhler [18] mentioned that apicoplast DNA permeates the entire apicoplast of T. gondii. Together these observations imply that the apicoplast in general has a rather spread-out nucleoid whose size is equal or very close to the size of the organelle itself, though the organelle may have a compact nucleoid, like those of other plastids, at certain developmental stages of the cell.

Transcription and splicing

All apicoplast genomes so far analyzed encode genes specifying the β, β′ and β″ subunits of the bacteria-type RNA polymerase (rpoB, rpoC1, rpoC2). By analogy, these subunits should form a complex with a dimer of the α subunit [62]. Unlike other plastid genomes, the apicoplast genome lacks the gene for the α subunit (rpoA); instead the nuclear genome of each plastid-bearing apicomplexan encodes two different rpoA genes. The α subunit of other organisms comprises two conserved domains: the N terminal domain required for assembly of the RNA polymerase complex and basal transcription [63, 64] and the C terminal domain that interacts with transcription activators recognizing the upstream promoter element [65]. Both α subunit proteins encoded by the apicomplexan nuclear genome have a predicted apicoplast targeting sequence and a well conserved N terminal domain, but neither contains sequence corresponding to the C terminal domain conserved among other α subunit proteins. Furthermore, the gene specifying the σ subunit, which binds to the promoter element and promotes transcription of the gene by the bacteria-type RNA polymerase [66, 67], seems to be missing from the genomes of apicomplexan species, as are the conserved −35 and −10 elements from the 5′ sequence upstream of the apicoplast genes. These peculiarities suggest that transcription in the apicoplast is regulated differently from bacteria or other plastids.

Transcription in the mitochondrion is governed by a nuclear-encoded RNA polymerase distantly related to the RNA polymerase of T7 bacteriophage [68]. A similar nuclear-encoded T7-like RNA polymerase (NEP) is present in plastids and involved in transcription of genes in the organelle [69]. The apicomplexan nuclear genome encodes one gene for a T7-like RNA polymerase which has been annotated as a mitochondrion-specific RNA polymerase [70]. In chloroplasts, the plastid-encoded bacteria-type RNA polymerase is believed to transcribe photosynthesis-related genes whereas NEP is used for all others [69, 71]. As the apicoplast genome lacks photosynthetic genes, importing NEP for transcription could be more beneficial than keeping genes of bacteria-type RNA polymerase subunits. Consequently, the nuclear-encoded T7-like RNA polymerase might be targeted to the apicoplast and function like NEP, though no experimental evidence is available.

Like those of cyanobacteria and plastids that have a group-I type self-splicing intron [72], the trnL(UAA) gene in the apicoplast genome has an intervening sequence at the corresponding position. By analogy, the insert has been predicted as a self-splicing intron, though there is no experimental proof. Unlike other apicomplexans, the apicoplast trnL(UAA) gene of E. tenella lacks the insertion, and it seems likely that it was lost specifically in the Eimeria lineage after it evolved away from the one leading to Toxoplasma. A gene for a potential reverse transcriptase associated with retrotransposons and retroviruses (XP_001238615) is annotated in the nuclear genome of E. tenella [73], whereas retroelements have not been annotated in either Plasmodium or Toxoplasma [74]. Whilst there is no evidence that the reverse transcriptase is expressed in E. tenella and targeted to the apicoplast, the absence of the potential group-I intron from the trnL(UAA) gene in this particular coccidian might have something to do with the presence of the nuclear-encoded reverse transcriptase.

Translation

The 5S rRNA is a component of the ribosome in almost all living organisms [75]. In general, plastid genomes contain the gene for 5S rRNA (rrf) immediately downstream of rrl. The exceptions so far known are dinoflagellates with the minicircle-type plastid genome [76], Chromera velia [77] and apicomplexans. Mammalian mitochondrial genomes lack rrf and a structural model suggested that the mitochondrial rRNA lacks a 5S rRNA-binding domain [78]. However, it has been reported that the organelle imports nuclear-encoded 5S rRNA from the cytosol [79, 80]. Whether the imported 5S rRNA is essential for the function of the ribosomes in mammalian mitochondria is still unknown [81]. Perhaps the apicoplast imports the 5S rRNA from the cytosol like mammalian mitochondria. Otherwise, there might be an rrf whose sequence is too divergent from others to be recognized. Or, the apicoplast may be a very exceptional plastid whose ribosomes are independent of 5S rRNA.

Rpl11 (L11) is a ribosomal protein that together with other ribosomal proteins—S4, L6 and L14 and the stalk proteins L10 and L7/L12—forms the “factor binding site” in prokaryote type ribosomes, including those in plastids [82]. The apicoplast genome encodes S4, L6 and L14 proteins and the nuclear genomes of P. falciparum, T. parva and B. bovis contain genes for L10 and L7/L12, which are probably targeted to the apicoplast [24, 25, 43]. One rpl11 gene has been annotated in the apicoplast genomes of T. gondii and E. tenella, and the product of the ORF129 gene in the P. falciparum apicoplast genome seems to be related to coccidian L11 [11]. These data imply that ribosomes in the apicoplast of these species probably have the factor-binding site like other prokaryote ribosomes. However, no gene in the T. parva or the B. bovis apicoplast genome seems to specify a product related to L11. This might be because piroplasmid rpl11 genes are too divergent to be identified by a simple sequence similarity search, or their ribosomes might not require the L11 protein.

It is intriguing to see in-frame UGA/UAA codons in some “genes” in the apicoplast genome of coccidians. Strictly, there is no direct evidence that these genes actually express proteins. However, the deduced amino acid sequences resemble those of Plasmodium and piroplasmids, especially when each UGA is regarded as a tryptophan codon, like those in bacteria with extremely A + T rich genomes [83]. This implies that the coccidian genes are not pseudogenes but encode expressed proteins. As in bacteria, translation termination in plastids relies on two different types of peptide releasing factors, RF1 and RF2. Each RF binds specific nonsense codons (RF1 binds UAA and UAG, whereas RF2 binds UAA and UGA) and promotes the release of the ribosome from mRNA [84, 85]. As the apicoplast genome lacks a gene for either RF, correct termination of translation in the organelle depends on imported RFs. Unlike those of Plasmodium and piroplasmids, the apicoplast genomes of Eimeria and Toxoplasma probably contain no gene with UGA as the termination codon. Perhaps the coccidian apicoplast is free from a functional RF2 on which the apicoplast genomes of Plasmodium and piroplasmids apparently depend.
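
The effect of reading UGA as tryptophan rather than as a stop can be shown with a small translation sketch. The standard codon table below is built from the conventional UCAG ordering; the variant table, the function name, and the toy mRNA are illustrative assumptions, not sequences or tools from the studies cited. The release-factor codon sets simply restate the specificities given in the text.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# Variant reading in which UGA encodes tryptophan instead of terminating translation.
UGA_AS_TRP = dict(STANDARD, UGA="W")

# Stop codons recognized by each release factor, as stated in the text.
RELEASE_FACTORS = {"RF1": {"UAA", "UAG"}, "RF2": {"UAA", "UGA"}}


def translate(rna, table):
    """Translate `rna` in frame 0 until the first stop codon or the end of the sequence."""
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Made-up mRNA with an in-frame UGA: AUG AAA UGA AAU UAU UAA
toy = "AUGAAAUGAAAUUAUUAA"
print(translate(toy, STANDARD))    # truncated at the in-frame UGA
print(translate(toy, UGA_AS_TRP))  # reads through UGA as W and stops at UAA
```

With the standard table the toy message yields only two residues, whereas the UGA-as-tryptophan reading gives the full five-residue product ending at UAA, a codon that both RF1 and RF2 recognize.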

Unlike UGA, UAA is the codon preferentially chosen for the translation terminator by many genes in the apicoplast genome. The apicoplast genomes of Eimeria and Toxoplasma encode the β″ subunit of prokaryote-type RNA polymerase by the rpoC2 gene that contains one in-frame UAA codon. By contrast, the rpoC2 in the P. falciparum apicoplast genome requires a frame shift to produce the right translation product [9]. The coccidian UAA codons and the P. falciparum frame shift occur in the same region of poorly conserved sequence. These data could suggest that the multi-subunit RNA polymerase of the apicoplast is unique, the β″ subunit being split into two halves. Indeed, the apicoplast genomes of T. parva and B. bovis also have split rpoC2 genes and the N and C terminal halves have been ascribed to separate genes [24, 25]. Besides rpoC2, there is another in-frame UAA codon in the rps8 gene of the T. gondii apicoplast genome. Because the amino acid sequence specified by this gene is conserved between different apicomplexan species, the gene probably expresses the polypeptide, though which amino acid residue the UAA is translated to cannot be predicted from the data available.

Import and maturation of organellar proteins

In plants, nuclear-encoded plastid proteins have a transit peptide at the N terminus that is cleaved from the matured form of the polypeptide by a stromal-processing peptidase (SPP) [86]. Like other secondary plastids, apicoplasts import nuclear-encoded proteins that have a signal peptide preceding the transit peptide [34]. The signal peptide is removed by the signal peptidase as the proteins are co-translationally released into the ER lumen and the resulting intermediate form with the exposed transit peptide is targeted to the stroma of the apicoplast [87]. Each membrane-crossing step is probably mediated by specific translocons such as ERAD/Der1 [88, 89]. Once the protein reaches the apicoplast stroma, the transit peptide is cleaved off by SPP as in primary plastids and the resulting polypeptide is matured with the help of organellar chaperones.

The gene for a putative SPP is found in the nuclear genome of each plastid-bearing apicomplexan species but is absent from Cryptosporidium. Although the SPP gene in Theileria encodes a protein with an apparent apicoplast targeting sequence [24, 42], those in P. falciparum and T. gondii appear to lack the bipartite organellar targeting sequence. Instead, the SPP gene of both species is preceded by the gene for apicoplast-targeted porphobilinogen synthase (PBGS) [46, 90], and exons encoding the organellar targeting sequence of PBGS are utilized to synthesize the SPP mRNA by alternative splicing [91, 92]. This unique micro-synteny occurs even in the nuclear genome of Babesia spp., though it is absent from T. parva and T. annulata [24, 42]. One explanation is that the unique gene expression system for SPP/PBGS is an ancient feature, probably established in a common ancestor of Apicomplexa, and has been kept in each plastid-bearing lineage thereafter, except for Theileria.

Protein maturation in organelles requires the type I chaperonin system that originated from the bacterial GroESL system [93]. Like proteobacteria, the mitochondrion has a system comprised of one species each of Cpn60 and Cpn10. By contrast, the plastid, which shares the same origin as cyanobacteria, contains a chaperonin system generally made of two different Cpn60 subunits and Cpn20 [94, 95]. In addition, plant plastids have a Cpn10 that is supposed to localize in the thylakoid lumen [96]. The nuclear genome of each plastid-bearing apicomplexan species encodes two Cpn60, one Cpn10 and one Cpn20 [97]. It has been shown in P. falciparum that one of the two nuclear-encoded Cpn60 proteins is targeted specifically to the apicoplast along with Cpn20, while the other Cpn60 and Cpn10 are localized exclusively to the mitochondrion [97–99]. The type I chaperonin system that involves Cpn20 and only one species of Cpn60 is unique to the apicoplast. The lack of Cpn10 is probably because the organelle has no thylakoid. The nuclear genome of the plastid-less Cryptosporidium spp. encodes orthologs of mitochondrial Cpn60/Cpn10 but lacks genes encoding orthologs of plastidic Cpn60/Cpn20, supporting the strict organelle-specific localization of the two type I chaperonin systems of plastid-bearing apicomplexan species.

All apicoplast genomes so far analyzed contain the clpC gene that specifies a member of the HSP100/Clp chaperone family. These are ATP-dependent protein unfoldases belonging to the AAA+ family, and ClpC belongs to a subgroup with two nucleotide binding domains (NTD) along with ClpA and ClpB [100]. The phylogenetically related Clp proteins of cyanobacteria and plastids are essential for normal growth [101–103]. In photobionts, ClpC forms a complex with ClpP peptidase (and related ClpR protein) in an ATP-dependent manner [104, 105], and the ClpCP(R) complex proteolyses mistargeted substrates. In addition, it has been presumed that ClpC bound to Tic110 maintains the solubility of proteins imported to plant plastids prior to transfer to other chaperones [106] and provides the driving force for complete translocation into the stroma [107]. A similar intrinsic chaperone activity, preventing aggregation of unfolded polypeptides and resolubilizing and refolding aggregated proteins into their native structures, has been reported for cyanobacterial ClpC [108].

The nuclear genome of apicomplexan species with the apicoplast encodes genes for ClpP and ClpR with the apicoplast targeting sequence. Like those of other plastids, these proteins are supposed to form a ClpCPR protease complex in the apicoplast. However, ClpC specified by the gene in the apicoplast genome has only one NTD in the molecule, unlike other plastidic ClpCs. Nevertheless, the apicomplexan nuclear genome encodes several HSP100/Clp family proteins with two NTDs, and some of them have a putative apicoplast targeting sequence. Perhaps one of these nuclear-encoded Clp proteins forms the ClpCPR protease complex in the apicoplast whereas others, including the apicoplast-encoded ClpC with only one NTD, are required for different functions. It is known that ClpB, a paralog of ClpC, and Hsp70 (DnaK) comprise a bi-chaperone system that is important for disaggregation and refolding of intracellular protein aggregates [109]. Like cyanobacteria, plastids of other organisms in general have both ClpB and Hsp70 whereas the apicoplast seems to lack Hsp70. Perhaps the apicoplast-encoded ClpC is an unusual ClpB that has evolved to compensate for the lack of Hsp70. Alternatively, the protein might directly act as a substitute for the missing Hsp70.

The sufB gene, formerly named ycf24, is found in the organellar genomes of Plasmodium and coccidian species, whereas it is missing in piroplasmids. In E. coli, sufB is tightly linked with other genes (sufA, sufC, sufD, sufE and sufS), forming an operon whose transcription is induced during exposure to hydrogen peroxide [110] and iron starvation [111]. SufS is a cysteine desulphurase that is involved in Fe–S cluster assembly, and SufE enhances the function of SufS [112, 113]. SufB forms a ternary complex with SufC and SufD [114] and stimulates the function of SufE on SufS [115, 116]. Plastids have inherited Suf proteins from their bacterial ancestor [117], though the genes have mostly been transferred to the nuclear genome. No Suf genes remain in the plastid genome of Viridiplantae, but a sufB–sufC cluster is generally found on the genome of red plastids. The apicoplast genome with sufB appears to be intermediate, as it lacks sufC; that gene is encoded by the nuclear genome together with other Suf genes.

Like Plasmodium and coccidians, piroplasmids have several apicoplast proteins requiring Fe–S clusters. However, they lack the sufB, sufC, sufD and sufE genes, while the sufS gene is present in the nuclear genome. One explanation is that the piroplasmid apicoplast acquires Fe–S clusters from outside the organelle, importing them by an unknown mechanism, and that SufS is required only for other metabolic processes such as tRNA modification. Alternatively, the organelle might have a simplified Suf system that would resemble the mitochondrial Isc system, where an Hsp70 takes the place of the SufBCD complex [118]. Perhaps the ClpC protein encoded by the duplicated apicoplast genes participates in the system, acting as the functional substitute for the Hsp70.

Conclusion and remarks

The lineages of Plasmodium and coccidians such as Toxoplasma and Eimeria are estimated to have diverged from each other about 350–824 million years ago, predating the divergence of Plasmodium and piroplasmids [74, 119, 120]. The remarkable synteny of the entire apicoplast genome between Plasmodium and coccidians is unlikely to have arisen secondarily by convergence. Probably the differently organized piroplasmid genomes were generated by later rearrangements. Despite these changes, the gene content of the piroplasmid apicoplast genomes is almost the same as that of Plasmodium and coccidians, suggesting that the present gene content is the minimum acceptable for these secondary plastids.

All apicoplast genomes so far analyzed contain clpC in addition to genes involved in transcription/translation. This fact could imply that the apicoplast genome exists in order to express ClpC. Perhaps the apicoplast-encoded ClpC is the functional substitute for the Hsp70 missing from the stroma of the organelle. The apicoplast genome of piroplasmids contains duplicated clpC genes but lacks the sufB present in the Plasmodium and coccidian apicoplast genomes. The mitochondrial Fe–S cluster assembly system (the Isc system) involves an Hsp70-family protein in place of the SufBCD complex of the plastidic Suf system [114, 118]. Again, one of the duplicated ClpCs might function as a substitute for Hsp70, compensating for the absence of the SufBCD complex in the Suf-type Fe–S cluster assembly in the piroplasmid apicoplast.

The most distant ancestor of the apicoplast must have been a photosynthetic primary plastid. However, the immediate origin of the apicoplast acquired by the ancestor of Apicomplexa and other related protists is unknown. The plastid could have been primary or secondary, but it must have come from an alga with a plastid whose organellar genome encoded SufB. Recently, the plastid genome was reported for a previously undescribed alga, CCMP3155, a photosynthetic alveolate that is phylogenetically close to the Apicomplexa [77]. This plastid genome contains all the genes in the apicoplast genome, and the order of genes in the apicoplast genome can be reconstructed from the CCMP3155 plastid genome with only a small number of hypothetical rearrangements. These data imply that the Apicomplexa and CCMP3155 are closely related, though not necessarily in direct line.

When the plastid-less ancestor of Apicomplexa acquired its symbiotic alga as a secondary plastid, some metabolic pathways in the organism must have been duplicated. Competition between duplicated pathways would have ensued to reduce the redundancy. Heme synthesis was one such pathway. In general, plastid-bearing organisms synthesize heme exclusively in the plastid using a characteristic metabolic pathway also used for chlorophyll synthesis. By contrast, plastid-less organisms have a distinct heme pathway that involves the mitochondrion and the cytosol. Probably because the plastid pathway is sufficient to supply all required heme, organisms with a photosynthetic secondary plastid, such as diatoms, maintain the complete plastid-localized heme pathway and have lost the non-plastid pathway (Fig. 2). Like other plastid-bearing organisms, apicomplexans such as P. falciparum have an algal-type PBGS that localizes in the apicoplast [46, 90]. However, they lack the algal enzymes required for synthesizing the substrate for PBGS in the plastid [121] and instead have ALA synthase, which is found exclusively in plastid-less organisms, in the mitochondrion [46]. This might suggest that chlorophyll synthesis had already become dispensable in the ancestral apicoplast of the ancient apicomplexan in which the selection took place. In other words, the alga from which the apicoplast originated could have been non-photosynthetic, contrary to the commonly held belief. Even so, the apicomplexan ancestor must have benefited from the acquired plastid, as it would be a useful source of various metabolites even when non-photosynthetic [122]. Why then has the plastidic PBGS been maintained instead of the non-plastidic one? One possible explanation is that the algal PBGS gene could not be lost because it was inserted by chance within the SPP gene to make the unique PBGS/SPP gene cluster.

Fig. 2

Evolution of heme biosynthesis in organisms with secondary plastids (hypothesis). The plastid-less ancestor had a non-plastid (N) type heme pathway, whereas the algal endosymbiont, which donated the secondary plastid, had a distinct pathway in the plastid (P). Algal genes were transferred to the host nuclear genome one by one. The ancestral organism with a secondary plastid initially had both P- and N-type pathways, but subsequently one of the two was selected and the other was eliminated as redundant. Because chlorophyll synthesis depends on the P-type heme pathway, photosynthetic organisms retained the P-type and discarded the N-type. By contrast, organisms that were not dependent on photosynthesis kept the N-type; this was often accompanied by loss of the plastid. The apicomplexan ancestor was non-photosynthetic and would have lost the P-type. However, the gene complex of PBGS and SPP in the nuclear genome made the loss incomplete, giving rise to a unique P/N hybrid heme pathway (thick line). Thereafter some apicomplexans lost the entire pathway

The proteome of the apicoplast of piroplasmids has shrunk considerably compared to those of Plasmodium and coccidians. This is probably because piroplasmids have acquired alternative sources of some metabolites that used to be supplied by the apicoplast. Apicoplast DNA is apparently missing from C. parvum [13], and the fact that the nuclear genome of C. parvum lacks genes required for the biogenesis of the apicoplast, such as Cpn60 and SPP [123], suggests that the species completely lacks the apicoplast as an organelle. Gregarinasina, one of the major apicomplexan subclasses, which contains at least 1,600 species [2, 124], has been shown to be closely related to the genus Cryptosporidium. Some, if not all, Gregarinasina may lack the apicoplast, like Cryptosporidium. Indeed, it was reported that G. niphandrodes lacks apicoplast DNA [14].

To investigate the past and future of the apicoplast, the genomes of many more apicomplexans should be studied, especially those placed phylogenetically between Plasmodium and T. parva/B. bovis as well as those belonging to Gregarinasina. The analysis of non-apicomplexans closely related to Apicomplexa is also essential. To elucidate the functions of present-day apicoplasts, it is critical to establish an isolation method, as biochemical and proteomic analysis of the purified organelle is required.