Possible import routes of proteins into the cyanobacterial endosymbionts/plastids of Paulinella chromatophora
Cite this article as: Mackiewicz, P., Bodył, A. & Gagat, P. Theory Biosci. (2012) 131: 1. doi:10.1007/s12064-011-0147-7
The rhizarian amoeba Paulinella chromatophora harbors two photosynthetically active and deeply integrated cyanobacterial endosymbionts acquired ~60 million years ago. Recent genomic analyses of P. chromatophora have revealed the loss of many essential genes from the endosymbiont’s genome, and have identified more than 30 genes that have been transferred to the host cell’s nucleus through endosymbiotic gene transfer (EGT). This indicates that, similar to classical primary plastids, Paulinella endosymbionts have evolved a transport system to import their nuclear-encoded proteins. To deduce how these proteins are transported, we searched for potential targeting signals in genes for 10 EGT-derived proteins. Our analyses indicate that five proteins carry potential signal peptides, implying they are targeted via the host endomembrane system. One sequence encodes a mitochondrial-like transit peptide, which suggests an import pathway involving a channel protein residing in the outer membrane of the endosymbiont. No N-terminal targeting signals were identified in the four other genes, but their encoded proteins could utilize non-classical targeting signals contained internally or in C-terminal regions. Several amino acids more often found in the Paulinella EGT-derived proteins than in their ancestral set (proteins still encoded in the endosymbiont genome) could constitute such signals. Characteristic features of the EGT-derived proteins are low molecular weight and nearly neutral charge, which both could be adaptations to enhance passage through the peptidoglycan wall present in the intermembrane space of the endosymbiont’s envelope. Our results suggest that Paulinella endosymbionts/plastids have evolved several different import routes, as has been shown in classical primary plastids.
Keywords: Paulinella chromatophora; Endosymbiosis; Plastid; Pre-sequence; Targeting signal; Endosymbiotic gene transfer
Primary plastid endosymbiosis and Paulinella chromatophora
Available data clearly demonstrate that plastids evolved from free-living cyanobacteria acquired by heterotrophic eukaryotic cells 1–2 billion years ago (Butterfield 2000; Douzery et al. 2004; Yoon et al. 2004; Kutschera and Niklas 2005; Kutschera 2009). This process, called primary endosymbiosis, resulted in plastids surrounded by two membranes. Such primary plastids are characteristic of three eukaryotic lineages: (i) glaucophytes, (ii) red algae, and (iii) green plants, including green algae and their land-plant descendants (Cavalier-Smith 2000; Palmer 2003; Gould et al. 2008; Archibald 2009). Although some authors still consider polyphyly of these photosynthetic groups and their plastids a reasonable hypothesis (Nozaki et al. 2007; Stiller 2007; Howe et al. 2008; Hampl et al. 2009), most now accept a monophyletic origin (Rodríguez-Ezpeleta et al. 2005; Reyes-Prieto et al. 2007; Burki et al. 2008). Consequently, glaucophytes, red algae, and green plants are grouped together in the kingdom of Archaeplastida or Plantae.
This tight host–endosymbiont relationship is especially well demonstrated by substantial reductions of the Paulinella endosymbiont genomes, which have been fully sequenced in two different strains, CCAC 0185 and FK01 (Nowack et al. 2008; Reyes-Prieto et al. 2010). The sizes and coding capacities of both genomes have decreased approximately threefold, down to ~1 Mb and ~900 genes compared to a ~3 Mb genome encoding ~3,500 genes in their closest free-living relative, the cyanobacterium Synechococcus WH5701 (see Nowack et al. 2008; Reyes-Prieto et al. 2010). This drastic genome reduction has been accompanied by the loss of many genes with products involved in essential biosynthetic pathways, such as the synthesis of amino acids (e.g., glutamine, arginine, methionine) and co-factors (e.g., riboflavin, biotin, coenzyme A) (Nowack et al. 2008; Reyes-Prieto et al. 2010). Moreover, and more importantly, individual genes were lost from vital and otherwise intact biosynthetic pathways (e.g., hemD encoding uroporphyrinogen III synthase), gene expression machinery (e.g., ligA encoding NAD-dependent DNA ligase), and subcellular structures (e.g., sulA encoding a cell-division inhibitor that blocks FtsZ polymerization). Finally, Paulinella endosymbiont genomes are especially poor in genes coding for solute channels and membrane transporters (Nowack et al. 2008; Reyes-Prieto et al. 2010). All these features strongly suggest that Paulinella endosymbionts import nuclear-encoded proteins in ways similar to other true cell organelles such as classical primary plastids (Bhattacharya and Archibald 2006; Yoon et al. 2006; Bodył et al. 2007; Mackiewicz and Bodył 2010) (Fig. 1).
The best candidates for genes with protein products that are imported into Paulinella endosymbionts/plastids are those transferred from the endosymbiont genome to the host nuclear genome through a process called endosymbiotic gene transfer (EGT) (Timmis et al. 2004; Bock and Timmis 2008) (Fig. 1). Recent Paulinella genome and transcriptome analyses have identified more than 30 nuclear-encoded genes acquired via EGT (Nakayama and Ishida 2009; Reyes-Prieto et al. 2010; Nowack et al. 2011). The actual number of EGT-derived genes is likely much greater, perhaps between 40 and 125 genes as estimated by Nowack et al. (2011), although this still is much lower than the ~1700–2500 genes transferred from the genomes of classical primary plastids to their hosts’ nuclear genomes (for reviews see Bock and Timmis 2008; Kleine et al. 2009). Many of the Paulinella transferred genes are engaged in photosynthesis or photo-acclimation of thylakoid membranes and are transcriptionally regulated by the host cell.
Possible import routes of proteins into Paulinella endosymbionts
Most proteins imported into classical primary plastids carry N-terminal targeting signals known as plastid transit peptides (Bruce 2000, 2001; Lee et al. 2008). These peptides are sufficient for their translocation across the plastid envelope with the help of (i) the translocon at the outer chloroplast membrane (Toc) and (ii) the translocon at the inner chloroplast membrane (Tic) (for reviews see Inaba and Schnell 2008; Jarvis 2008; Agne and Kessler 2009; Benz et al. 2009). Each of these translocons consists of several specialized protein subunits. Toc involves three kinds of such subunits: (i) Toc34, Toc64, and Toc159 function as transit peptide receptors, (ii) Toc75 forms a protein-conducting channel, and (iii) Toc12 is responsible for delivering imported proteins to the Tic translocon. The Tic translocon is composed of (i) Tic20, Tic21, and Tic110 that probably constitute three independent protein-conducting channels, (ii) Tic32, Tic55, and Tic62 that form a redox regulon, (iii) Tic22 that is responsible for the coordination of the Toc and Tic translocons and/or intermembrane space protein targeting, and (iv) the Tic40 co-chaperone that, along with the scaffold-channel Tic110 subunit and the stroma-residing Hsp93 and Hsp70 chaperones, provides a motor machinery to pull imported proteins into the stroma.
Although only three Toc and Tic homologs were found in the Paulinella endosymbiont genome, this does not exclude the possibility that other toc and tic genes were transferred to the host nuclear genome and their encoded proteins now are imported into the endosymbiont to create a Toc-Tic-like protein import apparatus. For example, the Paulinella Tic21, Tic32, and Toc12 homologs could form a translocon in the inner envelope membrane, together with other subunits (e.g., the endosymbiont-encoded Hsp93 and probably the nuclear-encoded Tic20, Tic55, and Tic62), that is very similar to the Tic system of classical primary plastids (Bodył et al. 2010). How proteins could pass across the outer membrane of Paulinella endosymbionts is less clear, however, because no homolog to the Toc75 pore (bacterial Omp85) is found in the endosymbiont’s genome (Bodył et al. 2010). At present, we cannot exclude the possibility that a gene encoding this protein was transferred to the host nuclear genome but has yet to be identified. Nevertheless, alternative ways for protein translocation across the outer membrane of Paulinella endosymbionts/plastids should be considered (Fig. 2).
As mentioned previously, the majority of proteins imported into classical primary plastids carry plastid transit peptides responsible for their Toc-Tic-dependent import, but some are equipped with typical endomembrane signal peptides (for reviews see Bhattacharya et al. 2007; Jarvis 2008; Bodył et al. 2009). It has been demonstrated experimentally that these proteins, specifically α-carbonic anhydrase, nucleotide pyrophosphatase/phosphodiesterase, and α-amylases (represented by αAmy3 and αAmy7), are trafficked to the plastids via the endomembrane system involving either the endoplasmic reticulum (ER) alone or in concert with the Golgi apparatus (Chen et al. 2004; Villarejo et al. 2005; Nanjo et al. 2006; Kitajima et al. 2009). Moreover, Armbruster et al. (2009) estimated that as many as 73 plastid-targeted proteins carry such signal peptides in Arabidopsis thaliana, constituting 5% of the plastid proteome. Therefore, it is reasonable that in P. chromatophora nuclear-encoded, endosymbiont-targeted proteins also could be delivered to the outer endosymbiont membrane in vesicles derived from the host’s endomembrane system. In support of this hypothesis, bioinformatics analyses of the sequence from Paulinella photosynthetic psaE gene, which was transferred to the host nuclear genome in the FK01 strain (Nakayama and Ishida 2009), revealed an upstream sequence with qualities of a typical signal peptide (Mackiewicz and Bodył 2010). Applied programs also showed the gene to encode an unambiguous cleavage site for this hypothesized peptide. These findings imply that the PsaE protein is first translocated into the ER lumen where it is processed, and then it is most likely targeted to the endosymbiont’s outer membrane in vesicles derived from the endomembrane system.
The above studies suggest that protein targeting to Paulinella endosymbionts/plastids proceeds via the endomembrane system; however, other bioinformatics analyses of nine EGT-derived genes from the Paulinella CCAC 0185 strain did not yield evidence for a universal targeting signal (Nowack et al. 2011). Only the products of two transferred genes showed some signal peptide predictions and, in one case, PsaE, a weakly supported mitochondrial transit peptide was suggested. Because these analyses provided somewhat ambiguous results, we reanalyzed these sequences by considering more polypeptide variants based on potential translation initiation sites and applying additional bioinformatics tools that predict different targeting signals. We also performed statistical analyses of the basic properties of these hypothesized proteins, including molecular weights, charges, and amino acid compositions, and investigated the origins of their potential targeting signals.
Materials and methods
The sequence of psaE from the P. chromatophora FK01 strain was kindly supplied by Dr. Takuro Nakayama and Dr. Ken-ichiro Ishida (Nakayama and Ishida 2009), whereas other EGT candidates with sequenced 5′ ends from the P. chromatophora CCAC 0185 strain were obtained from Nowack et al. (2011). The set of 867 amino acid sequences encoded in the Paulinella endosymbiont genome was downloaded from GenBank (http://www.ncbi.nlm.nih.gov), and 1,762 sequences of proteins imported into classical primary plastids with annotated transit peptides were extracted from the UniProt database (http://www.uniprot.org).
Programs applied in this study to predict different kinds of N-terminal targeting signals: signal peptide or signal anchor (SP/SA), plastid transit peptide (pTP), and mitochondrial transit peptide (mTP)
Programs that distinguish SP, pTP, and mTP: Bannai et al. (2002); Small et al. (2004); Petsalaki et al. (2005); Bodén and Hawkins (2005); Hoglund et al. (2006); Emanuelsson et al. (2000)
Programs specializing in the prediction of SP/SA: Gschloessl et al. (2008); Nielsen and Krogh (1998)
Programs specializing in the prediction of SP: DetecSig in ConPred II, Lao and Shimizu (2001); Nugent and Jones (2009); Käll et al. (2004); Hiller et al. (2004); Sigcleave in EMBOSS 3.0.0, Rice et al. (2000); Reczko et al. (2002); Shen and Chou (2007); Chou and Shen (2007); Bendtsen et al. (2004); Gomi et al. (2004); Fariselli et al. (2003)
Programs specializing in the prediction of pTP: Emanuelsson et al. (1999); Schein et al. (2001)
Programs specializing in the prediction of mTP: MitoProt II v1.101, Claros and Vincens (1996); Guda et al. (2004)
Alignments of Paulinella sequences with their 10 top BLAST cyanobacterial homologs found in the GenBank database (http://www.ncbi.nlm.nih.gov) were obtained in M-Coffee (Moretti et al. 2007) and prepared in Jalview (Waterhouse et al. 2009). Hydropathy plots of sequences were made using the Kyte–Doolittle scale (Kyte and Doolittle 1982) and a sliding window of 11 residues. Molecular weights and charges of proteins were calculated using pepstats from the EMBOSS 3.0.0 package (Rice et al. 2000). The non-parametric Mann–Whitney U test implemented in Statistica software (StatSoft, Inc., 2006) was used to estimate the statistical significance of differences in molecular weight, charge, and amino acid composition between the analysed protein sets.
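The hydropathy calculation described above can be illustrated with a minimal Python sketch. It is not the software used in this study; it only assumes the published Kyte–Doolittle values and an unweighted mean over each full 11-residue window. The function name hydropathy_profile and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=11):
    """Return the mean hydropathy of every full sliding window.

    Assumption: an unweighted mean per window; sequences shorter
    than the window yield an empty profile.
    """
    seq = sequence.upper()
    if len(seq) < window:
        return []
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a Paulinella protein.
    toy = "MKLLVVAILLAGVALASDEEKRK"
    for start, score in enumerate(hydropathy_profile(toy), start=1):
        print(start, round(score, 2))
```

Windows with high positive means mark hydrophobic stretches of the kind expected in signal peptides and transmembrane helices.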
Numbers of amino acid substitutions per site between Paulinella CsoS4A and its 14 closest cyanobacterial homologs were estimated in TREEFINDER (Jobb et al. 2004) as maximum likelihood distances calculated under the best-fit model Dayhoff + Γ(5), whereas in the case of the Paulinella homolog to Synechococcus WH5701_13905, and its ten closest cyanobacterial sequences, the JTT + Γ(5) model was used. The relative numbers of non-synonymous (dN) and synonymous (dS) substitutions for these sequences were calculated according to the modified Nei–Gojobori method, assuming p-distance (Nei and Gojobori 1986) as implemented in MEGA 5.03 (Tamura et al. 2011). Protein domain searches were performed in NCBI CDD database (Marchler-Bauer et al. 2011).
Prediction of targeting signals in Paulinella EGT-derived proteins
Number of algorithms that predict a given targeting signal for the pre-sequences of Paulinella nuclear-encoded, endosymbiont-targeted proteins, considering all possible translation initiation sites (TIS)
Proteins listed: PsaE (FK01 strain); PsaE (CCAC 0185 strain); Synechococcus WH5701_13415 homolog; PsaK, copy 2; Synechococcus WH5701_06721 homolog; Synechococcus WH5701_13905 homolog
Results of these analyses are presented in Table 2. As previously shown (Mackiewicz and Bodył 2010), 90% of algorithms predicted a signal peptide for the longest polypeptide that could be translated from the psaE open reading frame encoded in the nuclear genome of Paulinella FK01. For the CCAC 0185 strain, a signal peptide was also predicted confidently, by 73% of algorithms for the Paulinella PsbN translation initiation site variants and by 53% for the Paulinella homolog of Synechococcus WH5701_13415; however, the mature polypeptides of the two Paulinella PsaK proteins reached only 40% and 33% predictability for signal peptides. For all these proteins with potential signal peptides, except PsaE from the FK01 strain, hypothetical cleavage sites were predicted only ambiguously and were scattered within regions of the mature protein sequences. These results suggest that the N-terminal part of these proteins functions as a signal peptide enabling their co-translational translocation into the ER lumen, but that this region is not processed and thus resembles a signal anchor (see the next section).
Interestingly, in contrast to the PsaE from the Paulinella FK01 strain, the PsaE encoded in the nuclear genome of the Paulinella CCAC 0185 strain shows no traits of a signal peptide. Its longest hypothetical polypeptide was predicted to have a mitochondrial transit peptide by 60% of the algorithms employed. We also note that the remaining proteins analyzed from the Paulinella CCAC 0185 strain did not show significant predictions of any targeting signals (Table 2).
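The percentages quoted in this section are shares of prediction programs that agree on a given signal. A minimal Python sketch of such a tally follows; the program labels and the example votes are hypothetical placeholders, not the actual outputs behind Table 2.

```python
from collections import Counter


def signal_support(predictions):
    """Return the percentage of programs voting for each signal class.

    `predictions` maps a program label to its predicted class,
    e.g. "SP", "pTP", "mTP" or "none".
    """
    total = len(predictions)
    counts = Counter(predictions.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical votes from ten programs for one protein variant.
    votes = {f"program_{i}": "SP" for i in range(1, 10)}
    votes["program_10"] = "none"
    print(signal_support(votes))
```

Reporting the share of agreeing programs, rather than any single program's score, makes the result less sensitive to the biases of one predictor.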
Origin of potential targeting signals in Paulinella EGT-derived proteins
Similarly, the Paulinella PsbN variant that shows high signal peptide predictability is similar in length to its cyanobacterial homologs and is not equipped with a substantial N-terminal extension (Fig. 3c). The function of the PsbN protein is still unknown. In photosynthetic eukaryotes, PsbN is encoded in plastid genomes. Some data suggest that this protein represents a component of photosystem II (PSII) (Ikeuchi et al. 1995; Zouni et al. 2001) but this has not been confirmed in other studies (Kashino et al. 2002a, b). Nevertheless, it cannot be ruled out that this protein can sometimes bind transiently to PSII (Plöscher et al. 2009). The presence of reliably predicted transmembrane α-helices in the N-termini of PsbN sequences (Fig. 3c) suggests that this protein is anchored in the thylakoid membrane. Because both signal peptides and transmembrane domains are enriched in hydrophobic residues, it is reasonable that the N-terminal region of Paulinella PsbN mimics a signal peptide and can be recognized by the SRP (signal recognition particle) during its targeting to the ER membrane. The SRP would enable co-translational translocation of this protein into the ER lumen and its subsequent transport in ER- or Golgi-derived vesicles to the Paulinella endosymbiont/plastid. However, in contrast to typical signal peptides, the N-terminal region of Paulinella PsbN most likely is not removed and is required to anchor this protein in its subcellular target, the endosymbiont’s thylakoid membrane. Indeed, algorithms that predict cleavage sites of signal peptides gave ambiguous results and located them within the mature region of the protein sequence, which suggests that the potential signal peptide is not processed. Furthermore, in contrast to most cyanobacterial homologs, Paulinella PsbN lacks several negatively charged, polar, and hydroxylated residues at its N-terminus. Consequently, it has a longer hydrophobic region than cyanobacterial proteins, which extends toward the N-terminus of the sequence (Fig. 4b).
We hypothesize that an adaptation of the transmembrane domain toward signal peptide function occurred in the two Paulinella PsaK proteins as well. PsaK is subunit X of photosystem I (PSI) (Jone et al. 1991; Jordan et al. 2001) and has two transmembrane α-helices responsible for its insertion into the thylakoid membrane (Kjaerulff et al. 1993; Mant et al. 2001; Düring et al. 2007). In higher plants, this protein is equipped with a typical transit peptide responsible for its import via the Toc-Tic supercomplex (Kjaerulff et al. 1993). Interestingly, the two Paulinella PsaK proteins, which are devoid of N-terminal extensions, are further shortened at their N-termini by at least seven amino acid residues compared with their cyanobacterial homologs (Fig. 3d). In some cyanobacterial PsaK sequences this region shows a weaker hydrophobic character than the more proximal main transmembrane domain (Fig. 4c). Consequently, the N-terminal transmembrane α-helix is located closer to the beginning of the sequence in Paulinella PsaK proteins, which could enable the evolution of signal peptide-like domains at their N-termini. Moreover, the N-terminal ends of Paulinella PsaK sequences do not contain positively charged residues that are conserved in cyanobacteria and are poorer in hydroxylated residues than their cyanobacterial homologs.
Molecular weights and charges of Paulinella EGT-derived proteins
It is notable that the majority of Paulinella EGT-derived genes identified to date encode small proteins (Nowack et al. 2011). Moreover, Mackiewicz and Bodył (2010) found that PsaE from Paulinella strain FK01 has the same number of positively and negatively charged residues. They proposed this as an adaptation to the passage of this protein through the negatively charged peptidoglycan wall located in the intermembrane space of the endosymbiont’s envelope (Kies and Kremer 1979). To check how generally representative these properties are for Paulinella EGT-derived proteins (i.e., nuclear-encoded, endosymbiont-targeted proteins), we compared their calculated molecular weights and charges against all proteins encoded in the Paulinella endosymbiont genomes, as well as proteins imported into classical primary plastids by means of transit peptides.
Average and minimum–maximum range of molecular weight and absolute value of charge for three sets of plastid proteins
Columns: Set of proteins | Molecular weight (kDa) | Absolute value of charge
Rows: Nuclear-encoded proteins targeted to Paulinella endosymbionts; Proteins encoded in Paulinella endosymbiont genomes; Nuclear-encoded proteins targeted to primary plastids by means of pTP
The low molecular weights and almost neutral charges of Paulinella EGT-derived proteins fit well with the properties of proteins that can pass freely through the peptidoglycan wall. In agreement with this hypothesis, Demchick and Koch (1996) demonstrated that globular, uncharged proteins up to 24 kDa in molecular weight pass freely through the isolated unstretched peptidoglycan sacculi of Escherichia coli (Gram-negative bacterium) and Bacillus subtilis (Gram-positive bacterium). Interestingly, the molecular weights of all variants of the Paulinella EGT-derived proteins analyzed are well below that threshold (Table 3), whereas only 38% of Paulinella endosymbiont genome-encoded proteins and 22% of plastid-targeted proteins have weights under 24 kDa.
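The two properties compared in this section can be approximated with a short Python sketch. It is not the pepstats implementation: it assumes standard average residue masses, and it simplifies charge to the count of Lys and Arg minus the count of Asp and Glu, in line with the residue-counting argument made for PsaE. The function names are hypothetical; the 24 kDa limit is the value reported by Demchick and Koch (1996).

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight_da(sequence):
    """Average molecular weight in daltons of an unmodified peptide."""
    seq = sequence.upper()
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def net_charge(sequence):
    """Simplified charge: (K + R) minus (D + E); histidine is ignored."""
    seq = sequence.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative


def below_sacculus_limit(sequence, limit_kda=24.0):
    """True if the protein is lighter than the 24 kDa sacculus limit."""
    return molecular_weight_da(sequence) / 1000.0 < limit_kda


if __name__ == "__main__":
    # Hypothetical toy sequence, not a Paulinella protein.
    toy = "MKRDEAGLSV"
    print(round(molecular_weight_da(toy), 1), net_charge(toy),
          below_sacculus_limit(toy))
```

A protein with equal numbers of positively and negatively charged residues returns a net charge of zero under this simplification.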
Amino acid composition of Paulinella EGT-derived proteins
Average and quartile (Q1–Q2) range of amino acid percentages for Paulinella endosymbiont- and nuclear-encoded proteins
Columns: Amino acid residue | Nuclear-encoded proteins targeted to endosymbiont | Proteins coded in endosymbiont genome | Bonferroni corrected P value
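The quantities summarized in this table can be illustrated with a minimal Python sketch: the percentage of each amino acid in a sequence, and a Bonferroni correction of per-residue P values. The correction multiplies each P value by the number of tests supplied and caps the result at 1; the number of tests used in the study is not stated here, so that choice, like the function names and example P values, is an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_percent(sequence):
    """Return the percentage of each of the 20 standard amino acids."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1.0."""
    n_tests = len(p_values)
    return {key: min(1.0, p * n_tests) for key, p in p_values.items()}


if __name__ == "__main__":
    # Hypothetical toy sequence and made-up P values for illustration.
    print(composition_percent("MKKLAAGS")["K"])
    print(bonferroni({"K": 0.001, "A": 0.2, "G": 0.6}))
```

In practice the uncorrected P values would come from a per-residue test such as the Mann–Whitney U test named in the methods.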
Targeting signals of Paulinella EGT-derived proteins
Our analyses show that signal peptide-like sequences are the most commonly predicted N-terminal targeting signals in the Paulinella EGT-derived proteins studied, and were identified in five of the ten proteins analyzed (Table 2). This suggests that the signal peptide-carrying proteins are targeted to Paulinella endosymbionts/plastids via the host endomembrane system (Fig. 2) (see also Mackiewicz and Bodył 2010). The results obtained are rather unexpected because, by analogy to classical primary plastids, we might expect proteins imported into Paulinella endosymbionts/plastids to use N-terminal targeting signals resembling plastid transit peptides (Bruce 2000, 2001; Lee et al. 2008). Interestingly, a comparison of Paulinella EGT-derived proteins with their closest cyanobacterial homologs shows that the signal peptide-like sequences have different origins (Fig. 3). The signal peptide of PsaE protein from FK01 strain probably represents a typical cleavable signal peptide (see also Mackiewicz and Bodył 2010) that was added after transfer of this gene to the host nuclear genome. The Paulinella homolog of Synechococcus WH5701_13415 does not have such an extension, but its existing N-terminal sequence has acquired new properties of signal peptides and could play the same role (Table 2; Fig. 4). The N-terminal ends of PsbN and two PsaK proteins contain transmembrane domains that also show features of signal peptides and also could fulfil this function.
Some similarity of N-terminal transmembrane domains to signal peptides has been observed in plastid and mitochondrial outer membrane proteins (Kanaji et al. 2000; Lee et al. 2001, 2004; Horie et al. 2003; Waizenegger et al. 2003; Hofmann and Theg 2005). It was shown that several charged residues adjacent to the transmembrane domain play a crucial role in distinguishing these proteins from those directed to the endomembrane system. For example, experimental replacement of such residues with uncharged glycine, or their complete deletion, caused mistargeting of plastid AtOEP7 and AtToc64 proteins to the endoplasmic reticulum or the plasma membrane (Lee et al. 2001, 2004). Similar mutations that increase the hydrophobicity of the transmembrane domains and decrease the net positive charge within the flanking regions of mitochondrial outer membrane proteins Tom5 and Tom20 (Kanaji et al. 2000; Horie et al. 2003; Waizenegger et al. 2003) result in mistargeting to the endomembrane compartments. Interestingly, we observed similar changes and substitutions in the Paulinella PsbN and PsaK sequences compared with their cyanobacterial homologs (Fig. 3). This strongly suggests that these changes in the Paulinella transmembrane proteins represent the acquisition of signal peptide properties in their N-terminal transmembrane domains.
In contrast to the PsaE protein from the Paulinella FK01 strain, its counterpart from the CCAC 0185 strain has a putative mitochondrial transit peptide (Table 2), which implies post-translational import perhaps involving a mitochondrial protein-conducting channel that was relocated to the outer endosymbiont membrane (Fig. 2). Good candidates for such a translocation channel are Tom40 and Tim22. In support of this hypothesis, the outer membrane of higher plant plastids contains the OEP16 channel for protochlorophyllide oxidoreductase A (Reinbothe et al. 2004; Pollmann et al. 2007), which probably evolved from a relocated mitochondrial Tim22 pore (Cavalier-Smith 2006). A third mitochondrial candidate could be the homolog of Omp85 (and, therefore, Toc75), but this gene has been identified to date only in trypanosomatid parasites (Pusnik et al. 2011). At present, we cannot exclude the possibility that the mRNA sequence of psaE obtained from the CCAC 0185 strain is incomplete and could yet be found to contain upstream sequence encoding signal peptide properties. The published sequence is not limited by an in-frame stop codon upstream of the mature protein. Moreover, the available N-terminal extensions of the PsaE proteins from the two Paulinella strains differ significantly (Fig. 3). This suggests that the PsaE proteins from each of these strains evolved distinct targeting signals when their genes were independently transferred to the hosts’ nuclear genomes. This possibility is supported by substantial differences between these two Paulinella psaE genes, including different intron positions, intron sequences, and 5′ and 3′ untranslated regions (Nowack et al. 2011).
The two-membrane envelope of Paulinella endosymbionts/plastids and the endomembrane system-mediated targeting of their nuclear-encoded proteins
Paulinella endosymbionts/plastids are surrounded by two membranes. Their inner membrane is certainly derived from the cyanobacterial plasmalemma, but the origin of the outer membrane is less clear (Bodył et al. 2010; Mackiewicz and Bodył 2010). Because the cyanobacterial ancestor of Paulinella endosymbionts/plastids was surrounded by two membranes, it could be hypothesized that their outer membrane corresponds directly to the outer negibacterial membrane. Alternatively, the cyanobacterial outer membrane could have been lost and replaced entirely by the host phagosomal membrane. It is also important to consider the possibility that the outer membrane of Paulinella endosymbionts/plastids has a chimeric origin, and contains components of both bacterial and eukaryotic membranes. The cyanobacterium initially engulfed by P. chromatophora was undoubtedly surrounded by three membranes, the host phagosomal membrane and the two envelope membranes of the endosymbiont, i.e., its plasma membrane and outer membrane (Bodył et al. 2010; Mackiewicz and Bodył 2010). In the initial stages of the endosymbiosis, it is reasonable that uncoordinated divisions of the cyanobacterium, and the phagosome containing it, resulted in regular escapes of these endosymbionts into the host cytosol. During these escapes the outer cyanobacterial membrane could have acquired some lipids and proteins from the phagosomal membrane, a kind of membrane mutation as termed by Cavalier-Smith (2000). This would have led to a chimeric bacterial–eukaryotic membrane (see also Bodył et al. 2009). The existence of clear signal peptides in Paulinella nuclear-encoded, endosymbiont-targeted proteins is compatible with this scenario.
The above evolutionary scenario is consistent with the process that led to classical primary plastids. As with Paulinella endosymbionts/plastids, they have a cyanobacterial origin and are surrounded by two membranes (Cavalier-Smith 2000; Palmer 2003; Gould et al. 2008; Archibald 2009). It was argued for many years that the outer plastid membrane was derived directly from the cyanobacterial outer membrane (Cavalier-Smith 2000); however, it contains lipids characteristic of negibacterial outer membranes (e.g., galactolipids), as well as those found in eukaryotic phagosomal membranes (e.g., phosphatidylcholine) (see Kilian and Kroth 2003). This discovery inspired the hypothesis that the outer plastid membrane has a chimeric bacterial–eukaryotic origin (Kilian and Kroth 2003; Bodył et al. 2009). The identification of many nuclear-encoded, plastid-targeted proteins with signal peptides in higher plants and green algae (Bhattacharya et al. 2007; Jarvis 2008; Armbruster et al. 2009; Bodył et al. 2009) provides additional strong support for this hypothesis.
Possible import routes of Paulinella proteins without N-terminal targeting signals
In contrast to the proteins discussed above, four Paulinella EGT-derived proteins show no evidence of any kind of N-terminal targeting signals (Table 2). It is possible that the sequences of Paulinella homologs to CsoS4A and Synechococcus WH5701_06721 are extended upstream and, therefore, could encode N-terminal targeting signals; none of their upstream sequences are constrained by in-frame stop codons. In contrast, the homologs to Hli and Synechococcus WH5701_13905 are most likely complete sequences.
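Whether an upstream sequence is constrained by an in-frame stop codon can be checked with a short Python sketch. It assumes the standard genetic code stop codons and a known 0-based position of the candidate start codon; the function name and the toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code


def upstream_inframe_stop(mrna, start):
    """Return the index of the nearest in-frame stop codon upstream
    of `start` (0-based position of the start codon), or None.

    None means the reading frame stays open to the 5' end of the
    available sequence, so the true start could lie further upstream.
    """
    seq = mrna.upper().replace("U", "T")
    pos = start - 3
    while pos >= 0:
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
        pos -= 3
    return None


if __name__ == "__main__":
    # Hypothetical toy transcripts; the start codon ATG is at index 6.
    print(upstream_inframe_stop("TAAGCCATGAAA", 6))  # stop found at 0
    print(upstream_inframe_stop("GCCGCCATGAAA", 6))  # frame stays open
```

A result of None corresponds to the situation described for the CsoS4A and WH5701_06721 homologs, where an N-terminal extension cannot be excluded.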
The Paulinella Hli (high-light inducible protein), like its homologs, possesses a predicted α-helical transmembrane domain and probably is anchored in the thylakoid membrane (Funk and Vermaas 1999; Montané and Kloppstech 2000; He et al. 2001; Bhaya et al. 2002). The other three proteins do not have such recognizable transmembrane domains and, therefore, could reside in the intermembrane space or in the matrix. This indicates that the four proteins must pass one or two envelope membranes during their import into Paulinella endosymbionts/plastids. In such trafficking, they could be using some targeting signals that escaped detection by the algorithms used in this study, which are specialized for predicting classical N-terminal signals.
One such non-classical targeting signal is represented by the C-terminal cleavable region composed of positively charged residues, which was identified in the mitochondrial matrix-residing DNA helicase Hmi1 (Lee et al. 1999). Targeting signals that are not cleaved also were identified in proteins targeted to the inner membrane of classical primary plastids, including ceQORH (Miras et al. 2002, 2007) and Tic32 (Nada and Soll 2004). Their targeting signals appear to enable import into the plastid stroma from where they can be inserted into the inner membrane. The plastid-targeting signal in Tic32 probably is located in its first ten N-terminal residues, whereas in ceQORH it is encoded in an internal domain of 40 residues that is essential but not sufficient for correct plastid localization because it must act in concert with two adjacent domains required for import (Miras et al. 2007). It was demonstrated that import of ceQORH and Tic32 is mediated by a Toc-independent pathway because their translocation involves neither the Toc159 receptor nor the Toc75 channel (Miras et al. 2007). Interestingly, the gene for Toc75 was not identified in the Paulinella endosymbiont genome (Bodył et al. 2010) and is probably also absent from the host nuclear genome (Nowack et al. 2011). The lack of evidence for the presence of Toc75 and other Toc proteins, as well as the absence of recognizable plastid transit peptides in P. chromatophora, suggests that endosymbiont-directed proteins use an import route similar to the alternative pathways used by transit peptide-devoid proteins imported into higher plant plastids. Proteins with non-canonical signals targeted to classical primary plastids are likely to be more common than previously thought because proteomic studies of the A. thaliana plastid proteome revealed 142 (of 604) proteins without N-terminal cleavable targeting pre-sequences (Kleffmann et al. 2004).
Of particular interest in the context of Paulinella EGT-derived proteins are proteins located in the mitochondrial intermembrane space (IMS). Like the Paulinella proteins, they are characterized by low molecular weight (7–15 kDa) and the absence of N-terminal transit peptides (Lutz et al. 2003; Herrmann and Hell 2005; Neupert and Herrmann 2007). Many contain conserved patterns of cysteine (and histidine) residues that enable them to bind cofactors or form disulfide bridges. Their translocation through the outer membrane translocon (Tom) requires that they fold in the IMS, a step triggered by the acquisition of cofactors or by the formation of intramolecular disulfide bridges. However, the majority of Paulinella EGT-derived protein variants studied do not have any cysteine residue and eight have only one such residue, which is insufficient to form an intramolecular disulfide bridge. These sequences also are histidine poor; ten have only one His residue and each of the two translation initiation site variants of the PsaK 2 copy has three histidines. The other class of IMS mitochondrial proteins that lacks classical transit peptides requires binding to affinity sites for translocation (Herrmann and Hell 2005; Neupert and Herrmann 2007). The targeting signal identified in a representative of this class, heme lyase, consists of a complex pattern of hydrophilic residues (Diekert et al. 1999). Concentrations of such residues can be found in Paulinella EGT candidates lacking N-terminal signals, but their importance for targeting should be verified experimentally. Similarly, some targeting information can be carried by several amino acids that are used more frequently in the EGT-derived proteins than in proteins encoded in the Paulinella endosymbiont genome (Table 4). They could, for example, facilitate protein import through some unidentified outer membrane channels (Fig. 2) or be involved in still unknown trafficking mechanisms.
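The cysteine and histidine argument above amounts to a simple residue count: a chain needs at least two cysteines to form an intramolecular disulfide bridge. The following is a minimal sketch of that check, not the procedure used in this study. The function names and the three short sequences are hypothetical and were invented for illustration; they are not Paulinella proteins.

```python
from collections import Counter


def residue_counts(seq, residues="CH"):
    """Count the selected residues in a one-letter protein sequence."""
    counts = Counter(seq.upper())
    return {r: counts.get(r, 0) for r in residues}


def can_form_intramolecular_disulfide(seq):
    """A single chain needs at least two cysteines to form a disulfide bridge."""
    return residue_counts(seq, "C")["C"] >= 2


# Hypothetical sequences, invented for illustration only.
examples = {
    "no_cys": "MSTLLAVKGAEL",
    "one_cys": "MSTCLAVKGHEL",
    "two_cys": "MCTLLACKGHEL",
}

for name, seq in examples.items():
    print(name, residue_counts(seq), can_form_intramolecular_disulfide(seq))
```

Only a count is tested here. Whether two cysteines actually form a bridge depends on their spacing and on the folded structure, which such a count cannot assess.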
Are some Paulinella nuclear-encoded proteins targeted to the endosymbiont as mRNAs?
There is one more possible route for the product of a nuclear-encoded gene to be expressed in an organelle that does not require any targeting signal in the protein product; it operates at the nucleic acid rather than the protein level. Gómez and Pallás (2010) showed recently that a viroid-derived ncRNA acting as a 5′UTR-end mediates the specific import of mRNA for Green Fluorescent Protein into plastids of the tobacco Nicotiana benthamiana. These results suggest the existence of an alternative transport pathway into plastids, where an ncRNA functions as a key regulatory molecule to control the import of transcripts of plastid-directed nuclear genes into this organelle. Such import of transcripts instead of proteins would explain the lack of targeting signals in some Paulinella EGT candidates.
An ongoing process of endosymbiotic gene transfer can explain the absence of targeting signals in some Paulinella nuclear-encoded proteins
Data from the above-discussed peculiar plastid and mitochondrial proteins suggest that the Paulinella EGT-derived proteins that lack recognizable targeting signals still could be imported into these endosymbionts; however, it is possible that some of them are not imported or that their import proceeds with low efficiency. Interestingly, copies of Paulinella homologs to both CsoS4A and Synechococcus WH5701_13905 that were transferred to the nucleus, both of which also still are retained in the endosymbiont’s genome, exhibit very high substitution rates (Nowack et al. 2011). Estimated average numbers of amino acid substitutions per site between the nuclear copies and closest cyanobacterial sequences are 1.81 and 1.25 for Paulinella homologs to CsoS4A and Synechococcus WH5701_13905, respectively. The numbers for the corresponding endosymbiont-encoded copies are only 0.35 and 0.18 amino acid substitutions per site. Weaker purifying selection on nuclear copies also is evident at the DNA level. They show higher average dN/dS values compared to their cyanobacterial homologs. The ratios for Paulinella nuclear homologs to CsoS4A and Synechococcus WH5701_13905 are 0.40 and 0.50, whereas for the endosymbiont’s counterparts they are 0.20 and 0.15, respectively. In addition, an SH3 protein domain was not detected in the Paulinella nuclear homolog to Synechococcus WH5701_13905 although it was found in the endosymbiont’s copy and all cyanobacterial homologs. Domains typical for CsoS4A, such as ethanolamine utilization protein and carboxysome structural protein domain, were identified in the Paulinella nuclear gene copy but with a 3 × 10⁹ times higher E-value and 1.5 times lower bit score than in the endosymbiont and cyanobacterial homologs. These results suggest that the original functions of the Paulinella homologs to CsoS4A and Synechococcus WH5701_13905 could have changed after transfer to the nucleus.
It is also possible that the Paulinella nuclear copies have not yet acquired efficient targeting signals and, therefore, the endosymbiont’s copies are still maintained.
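The size of the rate differences quoted above can be made explicit with a short calculation. The sketch below uses only the values given in the text (Nowack et al. 2011). The dictionary layout and function names are our own illustrative choices, not part of the original analysis.

```python
# Values quoted in the text (Nowack et al. 2011), as (nuclear copy, endosymbiont copy).
SUBSTITUTIONS_PER_SITE = {
    "CsoS4A": (1.81, 0.35),
    "WH5701_13905": (1.25, 0.18),
}
DN_DS = {
    "CsoS4A": (0.40, 0.20),
    "WH5701_13905": (0.50, 0.15),
}


def fold_difference(nuclear, endosymbiont):
    """How many times larger the nuclear value is than the endosymbiont value."""
    return nuclear / endosymbiont


def summarize(table):
    """Fold difference per gene, rounded to two decimal places."""
    return {
        gene: round(fold_difference(nuc, endo), 2)
        for gene, (nuc, endo) in table.items()
    }


print("amino acid substitutions per site:", summarize(SUBSTITUTIONS_PER_SITE))
print("dN/dS:", summarize(DN_DS))
```

With these inputs, the nuclear copies have accumulated roughly five to seven times more amino acid substitutions per site than the endosymbiont copies, and their dN/dS ratios are about two to three times higher.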
Molecular weights and charges of Paulinella EGT-derived proteins
The most distinctive features of the EGT-derived proteins analysed are their low molecular weights and nearly neutral charges, which fit well with the properties of proteins known to cross peptidoglycan walls. The permeability limit of ~24 kDa against which we compared the Paulinella proteins is based on the peptidoglycan wall of E. coli, which is thinner (2–7 nm) than in cyanobacteria (e.g., 10 nm in the genus Synechococcus) (Hoiczyk and Hansel 2000). In some cases, however, it was found that the peptidoglycan wall can change in thickness locally, which could also be true in P. chromatophora. For example, 75–80% of the E. coli peptidoglycan surface is 2.5 nm thick while the remaining areas are ~7 nm in width (Labischinski et al. 1991). There is also the possibility of local lesions in the peptidoglycan wall, as was discovered at the contact sites of translocation pores in the outer and inner envelope membranes of glaucophyte plastids (Steiner and Löffelhardt 2005); however, if comparable import sites associated with local lesions have not evolved yet in the peptidoglycan wall in Paulinella endosymbionts/plastids, genes encoding small proteins will be preferentially transferred to the host’s nucleus. This could explain why the mass of nuclear-encoded proteins targeted to Paulinella endosymbionts/plastids (4.4–9.1 kDa) is several times lower than the benchmark limit of 24 kDa from E. coli. Moreover, the size of peptidoglycan pores, with diameters of ~2 nm, is probably more restrictive to protein passage than is the wall’s overall thickness. Interestingly, it was found that pores in the peptidoglycans from Gram-negative and -positive bacteria have similar average sizes and are relatively homogeneous in size as well (Demchick and Koch 1996). If we assume that peptidoglycan pores in Paulinella endosymbionts/plastids are the same size as in other eubacteria, then the size limit determined for E. coli proteins should also be valid for Paulinella proteins.
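The comparison between protein mass and the peptidoglycan permeability limit can be written out as a simple ratio. This is a minimal sketch that uses only the two figures quoted above: the ~24 kDa benchmark from E. coli and the 4.4–9.1 kDa mass range. The constant and function names are illustrative assumptions.

```python
PERMEABILITY_LIMIT_KDA = 24.0  # E. coli-based benchmark quoted in the text
MASS_RANGE_KDA = (4.4, 9.1)    # mass range of the proteins quoted in the text


def fold_below_limit(mass_kda, limit_kda=PERMEABILITY_LIMIT_KDA):
    """How many times smaller a protein is than the permeability limit."""
    return limit_kda / mass_kda


def passes_limit(mass_kda, limit_kda=PERMEABILITY_LIMIT_KDA):
    """True if the protein mass does not exceed the permeability limit."""
    return mass_kda <= limit_kda


for mass in MASS_RANGE_KDA:
    print(f"{mass} kDa: {fold_below_limit(mass):.1f} times below the limit, "
          f"passes: {passes_limit(mass)}")
```

For the quoted range, the proteins are roughly 2.6 to 5.5 times smaller than the benchmark, which is what "several times lower" refers to. Mass is only a proxy here, because passage through ~2 nm pores also depends on protein shape.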
Diffusion of charged molecules also can be influenced by peptidoglycan anionic groups (Steiner and Löffelhardt 2005). Consequently, it was suggested initially that the 1-carboxyl groups of d-glutamic acid of peptidoglycans in walls of glaucophyte plastids are amidated with N-acetylputrescine to lower their overall negative charge and polarity, thereby enhancing the passage of nuclear-encoded proteins into the plastid (Pfanzagl et al. 1996a, b; Pfanzagl and Löffelhardt 1999). Although the glaucophyte plastid-targeted proteins investigated to date turn out to be imported post-translationally, through a Toc-Tic supercomplex that bypasses the peptidoglycan wall (Steiner et al. 2005; Steiner and Löffelhardt 2005), problems associated with protein diffusion through the peptidoglycan wall still could be valid for Paulinella endosymbionts. So far, there is no evidence of a Toc-like translocon penetrating the endosymbiont envelope along with the Tic system (Bodył et al. 2010). Moreover, at least some of the Paulinella EGT-derived proteins (those with recognized signal peptides) are likely delivered to the endosymbionts in vesicles budding off from the host endomembrane system and are released into the endosymbiont intermembrane space, from where they must cross the peptidoglycan wall (Fig. 2) (Mackiewicz and Bodył 2010). It should also be pointed out that potential post-translational protein translocation through some protein-conducting channels in the outer endosymbiont membrane does not exclude release of the imported proteins directly to the intermembrane space. It is possible that such channels, even if present, have not developed a stable connection with the Tic-like translocons residing in the inner endosymbiont membrane.
We are very grateful to Dr. John W. Stiller for helpful comments and English editing and to two anonymous reviewers for their excellent comments and suggestions, which significantly improved the paper. A. B. is supported by funds from Wrocław University grant BS/1018/2010.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.