Evolution of photosynthetic reaction centers: insights from the structure of the heliobacterial reaction center
The proliferation of phototrophy within early-branching prokaryotes represented a significant step forward in metabolic evolution. All available evidence supports the hypothesis that the photosynthetic reaction center (RC)—the pigment-protein complex in which electromagnetic energy (i.e., photons of visible or near-infrared light) is converted to chemical energy usable by an organism—arose once in Earth’s history. This event took place over 3 billion years ago and the basic architecture of the RC has diversified into the distinct versions that now exist. Using our recent 2.2-Å X-ray crystal structure of the homodimeric photosynthetic RC from heliobacteria, we have performed a robust comparison of all known RC types with available structural data. These comparisons have allowed us to generate hypotheses about structural and functional aspects of the common ancestors of extant RCs and to expand upon existing evolutionary schemes. Since the heliobacterial RC is homodimeric and loosely binds (and reduces) quinones, we support the view that it retains more ancestral features than its homologs from other groups. In the evolutionary scenario we propose, the ancestral RC predating the division between Type I and Type II RCs was homodimeric, loosely bound two mobile quinones, and performed an inefficient disproportionation reaction to reduce quinone to quinol. The changes leading to the diversification into Type I and Type II RCs were separate responses to the need to optimize this reaction: the Type I lineage added a [4Fe–4S] cluster to facilitate double reduction of a quinone, while the Type II lineage heterodimerized and specialized the two cofactor branches, fixing the quinone in the QA site. After the Type I/II split, an ancestor to photosystem I fixed its quinone sites and then heterodimerized to bind PsaC as a new subunit, as responses to rising O2 after the appearance of the oxygen-evolving complex in an ancestor of photosystem II. 
These pivotal events thus gave rise to the diversity that we observe today.
Keywords: Photosynthesis; Reaction center; Evolution of photosynthesis; Sequence alignments; Structural alignments; Heliobacteria
The first bacteria to harness photosynthesis 3.0–3.5 Gya began the process of transforming the Earth’s environment into one capable of sustaining multicellular life through their manipulation of natural metabolic gradients and atmospheric composition, especially after the evolution of the water-splitting reaction (Hohmann-Marriott and Blankenship 2011; Fischer et al. 2016). The protein complex unique to photosynthesis is the reaction center (RC), which catalyzes uphill electron transfer (ET) across a biological membrane using light energy. Light is first absorbed by antenna pigments and the excitation energy is transferred into the core of the RC, which houses ET cofactors. Upon reaching these cofactors, the excitation energy triggers a primary charge separation (CS) event, and an electron is transferred down a potential gradient formed by the chain of ET cofactors to a terminal acceptor (Nelson and Yocum 2006; Blankenship 2014). The CS event is fundamentally important; it is the moment in which the energy contained within a molecular electronic-excited state is converted into the energy of a new, biologically useful redox state. In general, RCs are classified by their terminal electron acceptor within the protein. The terminal acceptor of Type I RCs is a [4Fe–4S] cluster, which has been observed to reduce soluble ferredoxin proteins (Vassiliev et al. 2001). Type II RCs reduce a mobile quinone to quinol, which then diffuses into the surrounding membrane (Diner et al. 1991; Cardona et al. 2012).
Seven extant bacterial phyla contain known photosynthetic representatives: Acidobacteria, Chlorobi, Chloroflexi, Cyanobacteria, Firmicutes, Gemmatimonadetes, and Proteobacteria (Cardona 2015; Fischer et al. 2016). Additionally, an ancient symbiosis between a cyanobacterium and eukaryotic ancestor gave rise to the modern eukaryotic algae and land plant lineages (Keeling 2010). Due to billions of years of evolutionary divergence between photosynthetic organisms and their wide distribution across different branches of the tree of life, the sequences of the core RC polypeptides have also widely diverged (Cardona 2015). Their low sequence homology has made piecing together the evolutionary history of photosynthesis difficult, even though the number of available genomes for photosynthetic organisms has greatly grown in recent years. However, despite the low sequence homology, fully folded RC polypeptides retain a surprising degree of structural similarity (Sadekar et al. 2006), which is advantageous for inferring functional and evolutionary relationships.
Several important questions remain concerning how the RC polypeptides radiated among the various prokaryotic lineages, as well as the sequence of events that led to the distinction in function between Type I and II RCs. Because the core polypeptides of RCs from different phyla share low sequence identity (< 30%), improving the amount of structural data available for RCs, both in resolution and diversity of phototrophic groups represented, is vital in addressing these evolutionary questions. When sequence identity falls below ~ 25%, colloquially termed the “twilight zone,” evolutionary relationships become too difficult to reliably map by simple primary sequence alignments (Doolittle 1986; Rost 1999). Assigning the relationships between groups of phototrophs and their proteins is also complicated by the lack of consensus regarding the evolutionary relationships between prokaryotic phyla in general (Battistuzzi and Hedges 2009). However, structural similarity can be used to infer evolutionary relationships well into the twilight zone (Rost 1999; Sadekar et al. 2006; Cardona 2015; Khadka et al. 2017).
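To make the identity threshold concrete, the following minimal Python sketch computes percent identity for one pair of already-aligned sequences and flags pairs that fall below ~ 25%. It is an illustration only, not the procedure used in this work: the function names, the choice of denominator (columns in which neither sequence has a gap; other definitions exist), and the toy sequences are all assumptions.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
# The ~25% cutoff is the approximate "twilight zone" threshold quoted in the text.
TWILIGHT_ZONE_PERCENT = 25.0


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Note: this denominator is one of several conventions in use (an assumption here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def in_twilight_zone(aligned_a: str, aligned_b: str,
                     threshold: float = TWILIGHT_ZONE_PERCENT) -> bool:
    """True when the pair's identity falls below the chosen threshold."""
    return percent_identity(aligned_a, aligned_b) < threshold


if __name__ == "__main__":
    # Made-up aligned fragments, not real RC sequences.
    seq_1 = "MKT-LLVAGG"
    seq_2 = "MRSALIV-GA"
    print(percent_identity(seq_1, seq_2), in_twilight_zone(seq_1, seq_2))
```

Applied to real RC core polypeptides, a calculation of this kind would be run on each pair of rows of a multiple sequence alignment; pairs flagged by the threshold are those for which structural comparison becomes the more reliable guide.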
In this paper, we have focused on the evolutionary relationships between the modern RCs, leaving open the questions regarding the radiation of bacterial lineages. We make use of previous biophysical conclusions in conjunction with new structural comparisons to generate hypotheses about the defining characteristics of two important RC ancestors: the last common ancestor of Type I RCs, and the last common ancestor of all photosynthetic RCs. Through the lens of our mechanistic analysis, we propose an evolutionary scheme to explain the Type I/Type II split and the various heterodimerization events that have occurred since.
The new 2.2-Å HbRC structure from Heliobacterium modesticaldum is representative of all known HbRCs
Recently, the X-ray crystal structure of the Type I RC from Heliobacterium (Hbt.) modesticaldum was solved at 2.2-Å resolution and deposited in the Protein Data Bank (PDB) under code 5V8K (Gisriel et al. 2017). This organism is a member of the family Heliobacteriaceae (commonly called the “heliobacteria”) within the phylum Firmicutes, a family which was first described in 1983 (Gest and Favinger 1983). The heliobacterial RC (HbRC) crystal structure revealed a dimer of the PshA polypeptide with two copies of a novel single-transmembrane helix (TMH) antenna subunit PshX bound at the periphery of PshA, in perfect C2 symmetry. This is the first structure of a homodimeric RC and confirmed the architecture that was first proposed 25 years ago (Liebl et al. 1993). Because homodimeric RCs are believed to be the evolutionary predecessors of heterodimeric RCs (Nitschke and Rutherford 1991; Blankenship 1992), this structure is an important milestone in the study of the evolution of these proteins. A general scheme of the HbRC and the membrane in which it resides is presented in Fig. 1. Its Gram-positive host (Vermaas 1994) lacks internal membranes and invaginations (Miller et al. 1986), so the HbRC lies within the single cell membrane, inside of the peptidoglycan cell wall. This membrane architecture is unlike that of chloroplast-containing organisms, where proteins involved in photosynthesis are contained in an extensive network of stacked thylakoid membranes (for a review, see Pribil et al. 2014). Upon a light-absorption event, the HbRC transfers an electron through its ET cofactors, referred to as P800 (primary electron donor), ec2 (analogous to the term “accessory” used elsewhere), ec3 (primary electron acceptor, analogous to the term “A0” used elsewhere), and FX (a [4Fe–4S] cluster), across the membrane to reduce a soluble ferredoxin on the acceptor side (Fig. 1) (Fuller et al. 1985; van de Meent et al. 1991; Kleinherenbrink et al. 1994). The now-oxidized HbRC is re-reduced on the donor side by a membrane-anchored cytochrome (cyt), cyt c553 (Prince et al. 1985; Albert et al. 1998).
The complete amino acid sequence of the HbRC core polypeptide, PshA, is known for four heliobacterial species: Hbt. modesticaldum (Sattley et al. 2008), Hbt. gestii (Miyamoto et al. 2006), Heliobacillus (Hba.) mobilis (Xiong et al. 1998), and Heliorestis (Hrs.) convoluta (personal communication from Robert Blankenship, Washington University in St. Louis). To understand the evolutionary implications of the recently solved HbRC structure, it is important to know if the PshA sequence from Hbt. modesticaldum is representative of the other known PshA sequences. The four species named above occupy two of the three major clades within the family Heliobacteriaceae. The third clade is composed of the genus Heliophilum (Hph.); however, no full PshA sequences from this genus are known. Only a partial N-terminal sequence of the PshA from Hph. fasciatum has been published thus far, so we do not include it in the comparison of the other PshA sequences (Mix et al. 2004; Oh-oka 2007).
The PshA protein from Hbt. modesticaldum shares more than 80% sequence similarity with each of the other three known PshA sequences, and 88.7% sequence identity on average. We have performed a sequence alignment of all known PshA sequences using Clustal Omega (Supplementary Information, Fig. S1). We have noted the residues comprising the Hbt. modesticaldum RC TMHs, the residues providing axial ligand or hydrogen-bonding (H-bonding) interactions with the ET chain cofactors, and the two cysteines (per PshA) that coordinate the [4Fe–4S] cluster. The sequence alignment shows that all these features are well conserved in the sequences from other PshA polypeptides. These data support the hypothesis that the HbRC structure from Hbt. modesticaldum acts as a good representative for all known heliobacterial species; therefore, we should be able to make informed hypotheses about the evolution of the HbRC with only the single published structure.
Comparison of all known RCs
All known RCs contain the common structural motif of a dimer of five TMHs, referred to here as the “ET domain,” with the ET cofactors sandwiched at the dimer’s interface. The cores of all known Type II RCs are heterodimers of 5-TMH proteins. In the anoxygenic Type II RCs from Chloroflexi, Proteobacteria, and Gemmatimonadetes, these are called the L and M polypeptides (also known as PufL and PufM, respectively) (Bryant and Frigaard 2006; Zeng et al. 2014), while in the oxygen-evolving photosystem II (PSII) from cyanobacteria and eukaryotes, these are called D1 and D2 (also known as PsbA and PsbD, respectively) (Barber 2012). The core of the Type I RC from cyanobacteria and eukaryotes, called photosystem I (PSI), is a heterodimer of the PsaA and PsaB polypeptides (Schubert et al. 1998; Jordan et al. 2001), while the core of the Type I RCs from the anoxygenic Chlorobi (green sulfur bacterial RC; GsbRC), Acidobacteria (chloracidobacterial RC; CabRC), and Firmicutes (heliobacterial RC; HbRC) is a homodimer of the PscA or PshA polypeptides (Rémigy et al. 1999; Bryant and Frigaard 2006; Heinnickel and Golbeck 2007; Garcia Costas et al. 2012). In Type I RCs, an additional 6-TMH domain that binds antenna pigments, referred to here as the “antenna domain,” is fused to the N-terminus of the 5-TMH ET domain. Duplication and diversification of a 6-TMH antenna-binding motif gave rise to the modern CP43 and CP47 (also known as PsbC and PsbB, respectively) proteins that bind antenna pigments and associate with D1 and D2 of PSII, functioning analogously to the antenna domain of the Type I core RC proteins (Schubert et al. 1998; Mix et al. 2005; Cardona 2016a).
All RCs are related to a distant common ancestor that appeared early in the history of life on Earth, well before the radiation of the major bacterial phyla observed today (Olson 1981; Nitschke and Rutherford 1991; Heathcote et al. 2002; Shi et al. 2005; Sadekar et al. 2006; Cardona 2015). As previously mentioned, the low sequence homology between core polypeptides makes reliable sequence alignments difficult to obtain (Rutherford et al. 1996; Cardona 2015), rendering traditional phylogenetic analyses problematic. Despite this, various structural features of RCs are well conserved (Olson 1981; Olson and Blankenship 2004; Sadekar et al. 2006; Fischer et al. 2016). The recent HbRC structure has added a new class for comparison. In this section, we compare the HbRC to other RCs, starting first with structural and functional features, before moving on to protein sequence comparison. We compare the topologies of phylogenetic trees generated from protein sequence information with that of a phenetic dendrogram constructed using structural data only, including for the first time that from the HbRC. The goal of this comparison is to show that a structure-only analysis is complementary to sequence-only analysis. Then, looking through a functional-mechanistic lens, we identify what we believe to be the key ET cofactor that helps us to understand the observed functional differences between the RC types. This sets the stage for our discussion of the trajectory of RC evolution.
Cofactor identities, orientations, and functions
The ET domain of all RCs coordinates the ET cofactors, which can be chlorophylls (Chls), bacteriochlorophylls (BChls), pheophytins (Pheos), bacteriopheophytins (BPheos), quinones (Q), and [4Fe–4S] clusters. Due to the dimeric nature of the RC core proteins, the ET cofactors line up in two “branches” (Figs. 1, 2). In all RCs, a special pair of strongly interacting and closely associated Chls or BChls [(B)Chls], called “P” (usually denoted with a subscript number, which is the wavelength associated with its photobleaching maximum after CS), is the ET cofactor situated nearest to the donor side (Blankenship 2014). Moving toward the interior of the protein, the next ET cofactor is ec2, followed by ec3, and then Q. In Type II RCs, the quinone is the final electron acceptor within the RC, but in Type I RCs, a [4Fe–4S] cluster called FX lies on the other side of Q, on the soluble face of the acceptor side (Nelson and Ben-Shem 2004; Nelson and Yocum 2006; Ohashi et al. 2010).
After energy transfer to the ET cofactors, an initial CS event occurs; a (B)Chl ET cofactor in an electronic excited state becomes a strong reducing agent, enabling it to reduce its neighbor (B)Chl ET cofactor. The initial radical pair is thought to be P+ec2− or ec2+ec3−. Regardless, within a few picoseconds a second ET event occurs (either from ec2− to ec3 or from P to ec2+), resulting in P+ec3−. From the P+ec3− state, more ET reactions can occur as the electron moves down a potential gradient; ec3− reduces Q or FX on the acceptor side, and P+ is re-reduced by a cellular electron donor on the donor side of the membrane (Fig. 1), thus opening the RC to another charge separation event.
The reduction potentials and activity of the various ET cofactors are tuned by their chemical structures and protein environments, and each RC class has optimized the ET reactions for the specific electron donors and acceptors in their host organisms. The chemical identity of the ET cofactor at each site (Fig. 2) follows a distinct pattern:
P is always a dimer of (B)Chls, and ec2 is always a (B)Chl of the same type as P.
The ec3 cofactor is always a Chl a (or close derivative) in Type I RCs and is a demetallated version (Pheo or BPheo) of the P/ec2 pigment in Type II RCs.
The orientation and placement of the ET cofactors (Fig. 2) also follow a distinct pattern:
In Type I RCs, ec2 is more perpendicular to the plane of the membrane; whereas it is more parallel to the plane of the membrane in Type II RCs.
The closest P-to-ec3 center-to-center distance is longer in Type I RCs than it is in Type II RCs.
Finally, the directionality-bias of CS along the two ET cofactor branches also follows a distinct pattern:
In Type I RCs, ET can proceed down either branch of the ET chain toward FX. In homodimeric Type I RCs (i.e., the HbRC, GsbRC, and CabRC), the branches are completely symmetric, so ET would not be expected to be biased toward one path. In the heterodimeric PSI, the ET branches are not completely symmetric; however, both branches support ET.
In Type II RCs (all of which are heterodimeric), ET is strongly biased down the ec2 and ec3 cofactors of one branch (the A-branch, bound by the L or D1 polypeptide). The quinone on the A-branch (QA, bound by the M or D2 polypeptide) is fixed within the RC and performs ET to the mobile quinone on the B-branch (QB, bound by the L or D1 polypeptide). After QB has been reduced to quinol, it diffuses away and is replaced by a fresh quinone. The B-branch ET cofactors (ec2B and ec3B, bound by the M or D2 polypeptide) are not thought to participate in forward ET to the quinones.
We have no convincing arguments with regard to the original evolutionary primacy of Chl/Pheo versus BChl/BPheo; however, this continues to be debated by other groups (Lockhart et al. 1996; Blankenship et al. 2007; Blankenship 2014; Cardona 2016b; Martin et al. 2017). The fact that Chl derivatives precede BChls in extant biosynthetic pathways would imply that Chl/Pheo was used first (Granick 1957, 1965). Conversely, devolution of the BChl synthesis pathway to end at Chl could have occurred to take advantage of Chl’s enhanced oxidation power. We have a slight preference for the original primacy of Chl/Pheo; however, resolving this issue is not essential to the arguments made in this paper, so we will be purposefully ambiguous about it.
The six patterns listed above define the functional differences between the Type I and Type II RC lineages. Some of the factors which determine how the different RCs bind and control their ET cofactors are coded into the primary sequence of each polypeptide and the cofactor biosynthesis pathways present in the host organism. Therefore, the evolutionary trajectory of RC function must be viewed, at least to a certain extent, through the lens of the evolution of the polypeptides. Numerous phylogenetic analyses have been performed on RC polypeptides (Mix et al. 2004; Sadekar et al. 2006; Cardona 2015, 2016a; Fischer et al. 2016; Khadka et al. 2017); however, we feel that it is useful to perform our own analysis to raise some important points about how the results should be interpreted.
ET domain phylogenetic trees and phenetic dendrogram
We first collected the polypeptide sequences for the ET domains of RCs that have structures deposited in the PDB (a list can be found in the Supplementary Information, Table S1). We constructed phylogenetic trees, both entirely sequence-based (we use the term “non-structure-based”) and sequence-based but structurally biased (we use the term “structure-based”; see the “Methods” section). In all 9 non-structure-based phylogenetic trees made from Clustal Omega (Sievers et al. 2011) multiple sequence alignments (MSAs), the PbRC core subunits (L and M) are a monophyletic group, the PSII core subunits (D1 and D2) are a monophyletic group, and the ET domains of PSI (last 5 TMH of PsaA and PsaB) are a monophyletic group (Fig. 3a). The HbRC shares a common ancestor with the LCA of the modern PSI polypeptides and the LCA of all modern Type II RC polypeptides, placing its branching point between the PSI and Type II ancestor. This overall topology is maintained by the 9 non-structure-based ET domain trees that were made by varying models and trimming strategies, with bootstrap values for the divergence events always above 70% before the divergence of a single modern polypeptide (e.g., L, D1).
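The bootstrap values quoted above come from resampling alignment columns with replacement and asking how often a given grouping recurs. The Python sketch below illustrates that idea in its simplest form; it is not the software used for Fig. 3a. The function names, the toy alignment, and in particular `closest_pair_clade` (a deliberately crude stand-in for real tree inference) are hypothetical.

```python
# Illustrative sketch of nonparametric bootstrap support for a clade.
# Real analyses rebuild a full tree for each replicate; here a toy
# "tree builder" returns only the single most similar pair of sequences.
import random


def bootstrap_alignment(alignment, rng):
    """Return one replicate: alignment columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have equal length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


def closest_pair_clade(alignment):
    """Hypothetical stand-in for tree inference: the two most similar sequences."""
    names = sorted(alignment)
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            mismatches = sum(x != y for x, y in zip(alignment[a], alignment[b]))
            if best is None or mismatches < best[0]:
                best = (mismatches, frozenset((a, b)))
    return {best[1]}


def bootstrap_support(alignment, build_clades, n_replicates=100, seed=0):
    """Percent of replicates in which each clade of the reference result recurs."""
    rng = random.Random(seed)
    reference = build_clades(alignment)
    counts = {clade: 0 for clade in reference}
    for _ in range(n_replicates):
        replicate_clades = build_clades(bootstrap_alignment(alignment, rng))
        for clade in reference:
            if clade in replicate_clades:
                counts[clade] += 1
    return {clade: 100.0 * n / n_replicates for clade, n in counts.items()}


# Made-up alignment: two near-identical sequences and one divergent sequence.
toy_alignment = {
    "typeII_a": "AAAAAAAAAA",
    "typeII_b": "AAAAAAAAAC",
    "typeI_a": "CCCCCCCCCC",
}

if __name__ == "__main__":
    print(bootstrap_support(toy_alignment, closest_pair_clade))
```

In this toy case the two similar sequences group together in every replicate. In a real analysis, support for a node falls when the columns favoring it are few, which is why values such as the 70% figure above are read as a measure of how consistently the alignment supports a divergence event.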
Placement of the root within any of the monophyletic groups would overcomplicate the tree and require drastically different rates of evolution between disparate groups. Because this is unlikely, the root should be placed within the interior branches, most likely between the red and teal diamonds; this is the current consensus in the field (Fig. 3a) (Olson and Blankenship 2004; Sadekar et al. 2006; Cardona 2015; Khadka et al. 2017). If this is the case, then a single early split defined the ET domain function; one descendant would have been the ancestor to all Type I RCs, and the other descendant the ancestor to all Type II RCs. The overall tree topology is in good agreement with previously hypothesized evolutionary relationships.
To test whether structure-based sequence alignments produce different phylogenetic results, we produced a second set of phylogenetic trees from the 19 ET domains using structure-based MSAs from the PROMALS3D (Pei et al. 2008) and PDBeFOLD (Krissinel and Henrick 2005) servers. These servers first superimpose all the input structures, then produce a sequence alignment by minimizing pairwise distances between residues from the different peptides. Since RCs are heavily composed of TMHs, this method forces all TMHs into sequence alignment. This alignment method should be useful for protein families (like RCs) in which the sequences have strongly diverged, yet secondary and tertiary structure have not. The non-structure-based and structure-based MSAs from Clustal Omega and PROMALS3D, respectively, are shown in the Supplementary Information, Figs. S2 and S3 with highlighted TMH residues (confirmed from the structures) for comparison. In all 18 phylogenetic trees made from structure-based alignments, a low-confidence topology is observed. Similar to the non-structure-based phylogenetic trees, the ET domains cluster near one another for PSII, PbRC, and PSI, with the HbRC sharing a more recent common ancestor with PSI than Type II RCs. However, the splitting pattern of L/M and D1/D2 is unresolved, exhibiting low bootstrap values (Supplementary Information, Fig. S4). Although this low confidence does not refute the evolutionary scheme inferred by the non-structure-based phylogenetic trees, it probably does signify that either (1) the confidence of non-structure-based ET domain phylogenetic trees is artificially inflated because of poor alignment of sequences, especially within TMHs (for example, compare the MSAs in Supplementary Information, Figs. S2 and S3), (2) seemingly homologous sequences now perform different structural roles, or (3) seemingly homologous structural features are cases of convergent evolution. It is likely that all three issues are relevant to RCs. Furthermore, a limitation of our method is the relatively low number of sequences used in the phylogenetic analysis, intentionally restricted to only sequences corresponding to solved three-dimensional structures.
To compare morphological relationships between the 19 ET domain structures without any reliance on sequence, we constructed a matrix of pairwise superposition RMSDs of the crystal structures, which was then used to create a phenetic dendrogram (Fig. 3b). The topology of the dendrogram is essentially identical to that of the non-structure-based phylogenetic trees. In this method, a root is automatically placed at the deepest node in order to calibrate the branching distances to the RMSDs. In Fig. 3b, a root should be placed between the Type I and Type II lineages, matching the simplest root placement shown in Fig. 3a. Therefore, although the non-structure-based phylogenetic tree confidence may be artificially inflated, the overall topology is still supported by structural comparison. An interesting observation to be gleaned from comparing the two trees shown in Fig. 3 is that the structural “rift” between the Type I and II RC groups is actually starker than the sequence comparison would lead one to believe.
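As an illustration of how a matrix of pairwise RMSDs can be turned into a rooted dendrogram, the Python sketch below applies average-linkage (UPGMA) clustering. The clustering method is an assumption made for the example, since it is not specified in this section, and the four labels and their RMSD values are invented rather than taken from Fig. 3b.

```python
# Illustrative sketch: phenetic dendrogram from pairwise RMSDs using UPGMA.
# UPGMA is assumed here for illustration; labels and distances are made up.
from itertools import combinations


def upgma(labels, rmsd):
    """Cluster by average linkage.

    rmsd maps frozenset({label_a, label_b}) -> distance.
    Returns (tree, merge_heights): a nested-tuple tree and the height of
    each merge (half the inter-cluster distance), in merge order.
    """
    clusters = {i: (label, 1) for i, label in enumerate(labels)}  # id -> (subtree, leaf count)
    dist = {
        (i, j): rmsd[frozenset((labels[i], labels[j]))]
        for i, j in combinations(range(len(labels)), 2)
    }
    merge_heights = []
    next_id = len(labels)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (i, j), d_ij = min(dist.items(), key=lambda item: item[1])
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # leaf-count-weighted average of the two old distances.
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist = {pair: d for pair, d in dist.items() if i not in pair and j not in pair}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        merge_heights.append(d_ij / 2)
        next_id += 1
    tree, _size = next(iter(clusters.values()))
    return tree, merge_heights


# Hypothetical RMSDs (in angstroms) between four ET-domain structures.
toy_labels = ["typeI_a", "typeI_b", "typeII_a", "typeII_b"]
toy_rmsd = {
    frozenset(("typeI_a", "typeI_b")): 2.0,
    frozenset(("typeII_a", "typeII_b")): 1.5,
    frozenset(("typeI_a", "typeII_a")): 4.0,
    frozenset(("typeI_a", "typeII_b")): 4.2,
    frozenset(("typeI_b", "typeII_a")): 4.1,
    frozenset(("typeI_b", "typeII_b")): 4.3,
}

if __name__ == "__main__":
    tree, heights = upgma(toy_labels, toy_rmsd)
    print(tree, heights)
```

Because the last merge joins the two most dissimilar groups, a method of this kind places the root at the deepest node automatically, which in the toy data falls between the two invented types.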
What our alignment analysis tells us is that sequence-based analysis, even when biased by structural comparison, is simply not enough to provide a detailed picture of the evolutionary path that RCs took from their last common ancestor to the diversity that we observe today. These methods can only tell us how the RCs are different. But, to fill in the gaps, we must rely on biochemical and biophysical data to reveal functional aspects of the RCs that cannot be easily predicted from sequence or structure. Quite simply, these data can tell us why the RCs are different. Through the recent experimental work on the HbRC from our laboratory and others, outlined in the next section, we have come to the conclusion that the quinone cofactors are the key to understanding why the Type I and Type II RCs differ.
The quinones are the key to understanding the difference between Type I and Type II RCs
Quinones are ubiquitous across the Bacteria and Eukarya domains, and are used by all photosynthetic RCs, as well as by many other proteins in various bioenergetic pathways. In Type II RCs, the two quinones have distinct roles: one (QA) is permanently bound as an intermediate in ET to the other (QB), which is the mobile terminal acceptor (Okamura and Feher 1992; Okamura et al. 2000). Two CS events are required to sequentially provide the two electrons to fully reduce QB; two protons, which originate from the acceptor side, must also be provided to QB during the process. The quinol then exits the QB site and is replaced by a new quinone from the membrane, reinitializing the acceptor side of the RC and increasing the reduction state of the quinone pool (Okamura and Feher 1992; Okamura et al. 2000). In PSI, the phylloquinone (PhQ) molecules are not mobile electron carriers, but are instead tightly bound cofactors serving as intermediates in ET between ec3 and FX. Therefore, use of quinones as mobile electron carriers has long been thought to be one of the defining characteristics of Type II RCs (Blankenship 1992).
Experiments probing the involvement of a quinone in the HbRC ET chain have produced conflicting results (Brok et al. 1986; Trost et al. 1992; Kleinherenbrink et al. 1993; Lin et al. 1995; Brettel et al. 1998; van der Est et al. 1998; Oh-oka 2007; Miyamoto et al. 2008; Kondo et al. 2015; Ferlez et al. 2016). It has been clearly shown that the HbRC does not require quinones for forward ET to the terminal electron acceptor, FX (Kleinherenbrink et al. 1993). The recent HbRC crystal structure revealed that cofactor spacings are slightly different than they are in PSI, which likely explains this (Gisriel et al. 2017). The primary donor (P800) and terminal electron acceptor (FX) are ~ 2.5 Å closer together (center-to-center) than the analogous cofactors in PSI. Additionally, ec2 is moved ~ 2.2 Å closer to FX and ec3 is moved ~ 2.4 Å closer to FX. Thus, FX is closer to the set of 6 cofactors that perform primary ET in the HbRC.
Heliobacteria use menaquinone (MQ) as their sole quinone; approximately 4–5 MQ per RC have been observed in membranes of Hbt. modesticaldum and Hba. mobilis. MQ has been found associated with purified HbRC in variable amounts, depending on preparation conditions (Trost and Blankenship 1989; Sarrou et al. 2012). The purification and crystallization conditions used to produce the recent HbRC structure removed MQ from the HbRC, so MQ was not resolved in the structure (Gisriel et al. 2017). It was noted, however, that an ill-defined area of non-peptide electron density lies near ec3. Although a flat quinone headgroup does not fit into the electron density, an isoprenoid tail approximately the length of that from MQ-9 does. One hypothesis to explain this is that during the purification process, the loosely bound MQ exchanged with an isoprenoid such as geranylgeranyl phosphate (Gisriel et al. 2017). Interestingly, the location of the electron-dense headgroup of this unassigned density does not lie between ec3 and FX. Instead, it lies ~ 5 Å to the periphery of the conjugated macrocycle of ec3. If this is the site where MQ normally binds, then it is more likely to be a site for double reduction of a mobile quinone than one for binding a cofactor serving as an ET intermediate between ec3 and FX.
Light-driven quinone reduction in Hbt. modesticaldum membranes has recently been demonstrated (Kashey et al. 2018), strongly suggesting that the HbRC is responsible for this activity. The data were consistent with a redox cycle between the HbRC and cyt bc complex, mediated by MQ (transporting electrons to the cyt bc) and the membrane-attached cyt c553 (transporting electrons back to the HbRC). Moreover, the inhibitor terbutryn, which binds the QB mobile quinone site of Type II RCs, inhibited the MQ photoreduction activity. If this peripheral site identified in the HbRC structure is indeed a mobile quinone reduction site, then the HbRC exhibits the ability to use both soluble ferredoxins and lipophilic quinones as terminal electron acceptors. A potential mechanism is the following: (1) CS in the presence of FX− results in the P800+ec3−FX− state, (2) ET from ec3− to the nearby MQ results in production of the semiquinone (P800+MQ−FX−), which (3) oxidizes FX− to yield the fully reduced quinol (P800+MQH2). Protonation of the reduced quinone species must also occur twice, but we cannot presently identify a clear protonation channel involving nearby residues or water molecules. The HbRC could thus be the first example of a Type I RC performing both Type I and Type II functions, a feature reminiscent of past hypotheses regarding a “Type 1.5 RC” as the ancestor of all RCs (Nitschke et al. 1996; Allen 2005; Sadekar et al. 2006). Since PSI from the menD1 mutant of Chlamydomonas has been shown to be capable of double-reducing plastoquinone inserted into the PhQ site (McConnell et al. 2011), and the HbRC appears to also be capable of double-reducing quinones, we hypothesize that the LCA of Type I RCs did so as well. If the LCA of Type I RCs did indeed have the ability to terminally reduce quinone to quinol, like modern Type II RCs, then the role of the quinones in the LCA of all RCs may be the key to understanding the Type I/Type II split.
A new proposal for the pathway of RC evolution
Using informed hypotheses and the idea that the role of quinones is key, we propose a new pathway for the evolution of RCs (Fig. 4). It is, of course, impossible to know if our proposed pathway is a faithful recounting of what occurred over 3 Gya. However, we have followed a few guiding principles to be consistent in our hypotheses:
No step should result in a loss of fitness (i.e., a potential future gain tomorrow cannot be used as justification for losing something that works well today).
Given the choice between two possible ancestors, the simpler one should be chosen in the absence of a compelling rationale (i.e., simpler versions usually precede more complex versions).
Any extinct RCs that predated the LCA of all extant RCs will not be considered, and their function(s) will be ignored (i.e., we will not debate the appearance of the first RC, which almost certainly predates the appearance of the LCA of all extant RCs).
We will not conflate host organism evolution with the evolution of the RC polypeptides (i.e., we focus only on RC function here, so as to avoid too much conjecture on ancient lateral gene transfer and taxon radiation events).
In describing this pathway, we will start at the LCA of all extant RCs, which we term the “ancestral reaction center” (ARC). We will propose the functional role and mechanism of the ARC, which in turn suggests why the ARC diversified into the Type I and Type II lineages. Then, we consider the Type I and Type II branches separately, providing rationale for the diversity observed today. Throughout, we will view the evolutionary pathway through the lens of cofactor changes and rearrangement, which have led to energetic and redox tuning of the ET chain and thus RC function.
The ARC was a homodimeric RC that reduced quinones
Quinones are found in all known photosynthetic RCs, strongly suggesting that the ARC also used quinones. Reduction of quinones on one side of the membrane and oxidation of quinols on the other side is a major proton-pumping mechanism driving ATP synthesis. The discovery that quinones can be fully reduced and perform mobile exchange by the homodimeric HbRC suggests that mobile quinones were an ancestral feature in the Type I lineage, and thus in the ARC as well. If low-potential electron sources were relatively abundant on the early anoxic earth (e.g., Fe2+, S−, H2), as is currently thought (Fischer et al. 2016), then ATP production by cyclic electron flow (CEF) was likely to be the primary role of the earliest RCs. Therefore, the idea that the ARC evolved early in the history of life as a homodimeric membrane protein whose sole purpose was to perform the light-driven reduction of quinones to quinols, without the use of an iron-sulfur cluster, is a reasonable hypothesis. Cytochromes and quinones, as well as a protein serving the role of the cyt bc complex, would have predated the RC (Furbacher et al. 1996; Dibrova et al. 2013), and their function was to link the electron transport pathways of chemo(litho)trophic metabolism to proton pumping. Thus, the ARC would have fit into this preexistent chain in the first phototrophs to boost ATP production using light energy.
It has been suggested recently that a Type I RC appeared first and Type II RCs evolved from it (Martin et al. 2017). This conclusion was based on three premises, each of which builds upon the previous one: (1) that the organism in which phototrophy first evolved was an obligate autotroph, (2) that the carbon fixation pathway in this organism used low-potential ferredoxins that were reduced with molecular hydrogen (H2) as an electron source, and (3) that use of a Type I RC would have been the best answer to generate reduced ferredoxin and power carbon fixation when using higher potential electron sources (e.g., sulfide). Although this scenario is interesting, the argument that only a Type I RC could have initially been useful is by no means iron-clad. Firstly, if reduced carbon were so scarce that autotrophy was required in the earliest organisms, it is difficult to imagine how the earliest cells could ever have arisen. Thus, it is not inconceivable that the earliest phototrophs made use of abiotically reduced carbon (Joe et al. 1986; Etiope and Sherwood Lollar 2013). Secondly, while the Wood–Ljungdahl (reductive acetyl-CoA) pathway makes sense as the original autotrophic mechanism, because it is the only carbon fixation pathway that conserves energy, it does not generate nearly enough ATP to drive conversion of all the fixed carbon into other necessary biomolecules (Ragsdale and Pierce 2008). Thus, the appearance of a simpler Type II RC in such an organism to generate additional ATP (via CEF) for the synthesis of those necessary biomolecules might still have been of enormous benefit. In other words, the ARC did not necessarily need to generate a low-potential reductant like ferredoxin in order to provide a selective advantage.
If we take the contrary opinion that mobile quinone reduction in Type II RCs is a derived trait, we must ask the question: what would have been the driving force for losing FX in the Type II lineage? Since it appeared, FX has functioned very effectively in all Type I RCs to reduce the ferredoxin pool. Moreover, a CEF system in which ferredoxins are reduced by the RC would result in more proton pumping than one in which quinones are reduced by the RC (Supplementary Information, Fig. S5). Ferredoxins can be used to reduce NAD(P)+, and a NAD(P)H dehydrogenase (i.e., Complex I) can be used to pass the electrons to the quinone pool, resulting in 1–2 additional protons pumped per electron transferred, depending upon the coupling of proton pumping to electron transport in the ancient NAD(P)H dehydrogenase. Thus, if the ARC had an FX cluster and was part of such a CEF system, to trade that for a system in which less ATP was made would represent a significant decrease in fitness, and would therefore have been strongly selected against. The simpler hypothesis, therefore, is that the ARC did not contain FX and the Type I lineage gained it.
If the ARC was indeed homodimeric, lacked an FX cluster, and reduced mobile quinones, we can infer that it faced an inherent limitation in its chemistry, due to its two symmetric ET branches. Following CS and subsequent ET to one of the quinones and re-reduction of P, the unstable semiquinone radical could have been processed in three different ways (Fig. 5):
(1) Exchange of the semiquinone (QH·/Q·−) with a new quinone: Semiquinone disproportionation in the membrane (either spontaneous or catalyzed by an unknown enzyme) could lead to production of a quinol (2 QH· → Q + QH2). Alternatively, the semiquinone could have been re-oxidized in the Qo (QP) site of the cyt bc complex or analog thereof.
(2) A second CS event in which ET proceeds down the same branch: After proton-coupled ET to the semiquinone, along with an additional protonation, the mobile quinone will have been converted to a quinol, which can exchange with a new quinone.
(3) A second excitation event where ET proceeds down the opposite branch: This would lead to production of a semiquinone radical on the other side. Subsequently, a slow proton-coupled radical disproportionation reaction could occur, accompanied by final protonation of the acceptor. The newly formed quinol could then be replaced by a new quinone.
A major problem for all these pathways is the very low pKa of the semiquinone (QH· ⇌ Q·− + H+), which makes Q·− difficult to protonate. This fact, coupled with the very low reduction potential of the semiquinone anion (Q·− + e− → Q2−), led to the suggestion that the second quinone reduction step in an RC must be a proton-coupled ET (Q·− + e− + H+ → QH−) followed by protonation (QH− + H+ → QH2), thus avoiding the unstable intermediates QH· and Q2− (Graige et al. 1996; Okamura et al. 2000). Therefore, Path 1 would not be expected to be very effective, as neither protonation of the semiquinone nor exit of the semiquinone anion from the protein interior to the less polar lipid environment would be favorable. Moreover, Path 2 is unlikely to occur, since the negative charge on the semiquinone anion close to ec3 would disfavor reduction of the nearby ec3 due to electrostatic repulsion. Excitation of an RC with a semiquinone anion on one branch would instead likely result in CS on the other branch, as was demonstrated for a PSI mutant in which ET from PhQA was slowed (Santabarbara et al. 2015). Consequently, double reduction of a quinone by Path 2 would not occur until both quinones were in the semiquinone state. It therefore seems likely that Path 3, in which disproportionation of the two semiquinones (with concomitant protonation of the acceptor quinone) forms QH−, which is then protonated to form QH2, would be the major pathway of quinone reduction in the ARC. If this disproportionation reaction was slow, then CS on both branches would be inhibited, leading to CR. Thus, while the ARC would be able to convert light energy into chemical energy by oxidizing a high-potential electron donor (e.g., cytochrome c) and reducing quinone to quinol with uptake of protons from the cytosolic (N) side, it would do so rather inefficiently. We hypothesize that the Type I/II split represented two different solutions to the same problem: how to improve the efficiency of this reaction.
The appearance of FX in the Type I lineage
The branching point in RC evolution from the ARC to the Type I RC lineage was the acquisition of the FX cluster. We term this new ancestral form “Proto-RC1” (Fig. 4). In our view, this was the key innovation driving the other changes in the Type I lineage. The [4Fe–4S] binding loop would have been positioned near the cytoplasmic side (as it is today in extant Type I RCs), allowing pre-existing cytosolic [4Fe–4S] cluster insertion machinery to interact with the site after membrane insertion of the core polypeptides and dimerization. The homodimeric nature of Proto-RC1 also means that only two (rather than four) mutations resulting in cysteine would be required in the cytoplasmic loop between the second and third TMH of the ET domain. Thus, this evolutionary step would not have been very difficult.
We believe that the main function of FX in the Proto-RC1 was to facilitate quinone reduction. Nature can tune the reduction potential of Fe–S clusters over a wide range (i.e., over 1 V of potential; Bak and Elliott 2014) by changing the environment of the cluster. The two reduction potentials associated with quinone reduction (i.e., the Q/Q·− and the Q·−/QH− couples) can also be tuned by their environment, although the first will always be more negative than the second, since the semiquinone is the least stable species. The reduction potential of ec3 is very low (~ − 1 V), thus it should always be favorable for this cofactor to reduce either Q or FX.
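The thermodynamic argument in the preceding paragraph can be made concrete with the standard relation ΔG = −nFΔE. The following is only a minimal sketch: it treats the quoted potentials as if they were standard potentials, ignores concentrations and electrostatic effects, and uses an illustrative acceptor potential of −0.5 V that is an assumption chosen for the example, not a measured value for any particular cofactor. The function and variable names are likewise hypothetical.

```python
# Faraday constant in coulombs per mole of electrons.
FARADAY = 96485.332


def delta_g_kj_per_mol(e_donor_v, e_acceptor_v, n_electrons=1):
    """Free energy change (kJ/mol) for electron transfer from donor to acceptor.

    Uses dG = -n * F * (E_acceptor - E_donor); a negative result means
    the transfer is downhill (favorable).
    """
    return -n_electrons * FARADAY * (e_acceptor_v - e_donor_v) / 1000.0


# Approximate ec3 potential quoted in the text (~ -1 V).
E_EC3_V = -1.0
# Illustrative acceptor potential (assumption for this sketch only).
E_ACCEPTOR_V = -0.5

dg = delta_g_kj_per_mol(E_EC3_V, E_ACCEPTOR_V)
print(f"dG for ec3 -> acceptor: {dg:.1f} kJ/mol")  # -> -48.2 kJ/mol
```

Any acceptor whose potential lies above that of ec3 gives a negative ΔG in this simple picture, which is the sense in which reduction of either Q or FX by ec3 should always be favorable.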
We can envision two possible mechanisms of action for the new FX cluster depending on how the reduction potential of FX is poised. In the first, the reduction potential of FX is below that of the Q/Q·− couple, leading to the production of a semiquinone on each side of the RC after two CS events. Here, FX could serve as a catalyst to accelerate the disproportionation reaction of the two semiquinones. Even if the first ET event (Q·− FX Q·− → Q FX− Q·−) were uphill, the second event (Q FX− Q·− + H+ → Q FX QH−) would be very favorable. A final protonation would render quinol formation irreversible.
In the second mechanism, the reduction potential of FX is above that of the Q/Q·− couple, so the first CS event would lead to reduction of FX. The second CS would result in the P·+Q·−FX− state, and ET from FX to the semiquinone would produce quinol (i.e., the unstable Q·− oxidizes FX−) (Reaction Scheme 1). The latter mechanism is similar to what we have hypothesized for reduction of a mobile quinone in the HbRC (Kashey et al. 2018). As mentioned previously, mutations in the region of the FX cluster could change the potential of the cluster in a relatively simple fashion. Thus, even if the Proto-RC1 used the first mechanism immediately after introduction of FX, transition to use of the second mechanism (Reaction Scheme 1) would not be difficult.
With FX in place on the cytoplasmic side of the RC with a reduction potential above that of the previous cofactors, the electron generated by the first CS event would always end up there. This situation was one that natural selection could exploit, as any soluble cellular electron carriers with a higher reduction potential could be reduced by FX− if they were able to bind to the cytoplasmic face of the RC, even transiently. Indeed, the acceptor side of the HbRC seems to be quite promiscuous, capable of reducing any acceptor with a potential above its own (~ − 500 mV) (Ferlez et al. 2016), including those not found in heliobacteria, such as cyanobacterial flavodoxin (Romberger and Golbeck 2012). In all the lineages with Type I RCs, ferredoxins were likely recruited from the genome to interface with the RC, perhaps binding weakly at first (allowing for quinol production most of the time), then interacting progressively better to accept the majority of the electrons from the RC. The lack of homology between the FA/FB proteins associated with Type I RCs supports this scenario; there is no universal Type I RC FA/FB protein that radiated among prokaryotes along with the RC. Simple BLAST searches indicate that PshB, PscB, and PsaC are all more related to ferredoxins from non-photosynthetic organisms than they are to each other (Supplementary Information, Tables S2, S3, and S4). For example, in heliobacteria, the PshB ferredoxin found to associate with the HbRC is closely related to clostridial and bacteroidial ferredoxins and shows little homology, other than the typical ferredoxin Fe–S cluster-binding motif, with the FA/FB proteins of the green sulfur bacteria (PscB) or cyanobacteria (PsaC). In fact, the gene encoding PshB is not even located within the photosynthetic gene cluster (Heinnickel et al. 2007; Sattley et al. 2008), indicating it was not laterally transferred to the heliobacterial ancestor with pshA.
With the increasing use of ferredoxins, rather than quinones, as electron acceptors from the Proto-RC1, the host organism would gain an advantage in ATP production from the longer ET cycle (Supplementary Information, Fig. S5). Thus, there would have been a selective advantage in optimizing ET to ferredoxins via FX instead of Q. One of the ways in which the RC was modified to do this was to use a Chl a as ec3, which would have produced a larger driving force for ET to FX. The other modification was to change the orientation of ec2 to stretch the 6-(bacterio)chlorin system across the membrane, placing ec3 closer to FX for direct reduction. The change in ec2 orientation would have resulted in changes in the arrangement of the TMHs of the ET domain, with the result that only the prime epimer of the pigment in the P site could be accommodated there. We term the RC that contained all of these changes “LCA-RC1” (Fig. 4), the last common ancestor of all extant Type I RCs. We will go into more detail about the changes in ec3, ec2, and the special pair in later sections.
As noted above, an expanded electron transport cycle that included a Complex I (NADH dehydrogenase) in addition to Complex III (cyt bc) would substantially increase the number of protons pumped per electron transferred (Supplementary Information, Fig. S5). This would provide the evolutionary driving force to switch FX from a cofactor enabling quinone reduction to one whose primary function was to reduce ferredoxins. However, in situations when soluble electron acceptors are in short supply (i.e., when the ferredoxin pool is largely reduced, possibly in high light conditions), then the Type I RC could fall back on quinone reduction as a back-up system to make a smaller amount of ATP via a short cycle involving only Complex III. This seems to be what heliobacteria do now (Kashey et al. 2018).
Heterodimerization events in the Type II lineage
In what led to the Type II lineage, a different mechanism was used to speed the inefficient quinone disproportionation reaction: heterodimerization, which allowed for specialization of the two ET branches. In one branch (A-side), the quinone was immobilized and cut off from a proton source, to convert it from a terminal acceptor into an ET intermediate (QA). In the other branch (B-side), the reduction potentials of the ec2 and ec3 cofactors were lowered to inhibit CR between the semiquinone and oxidized special pair (P+QB− state) (Kirmaier et al. 1985, 1991; Parson et al. 1990; Warshel et al. 1994). The reduction potentials of QA and QB were also tuned to favor ET from the former to the latter. The better stability of the QB·− semiquinone would allow enough time for reduction of P+ and a second CS to occur, leading to successful double reduction of QB to quinol. Note that this last step is the same semiquinone disproportionation reaction (QA·− QB·− + 2 H+ → QA QBH2) described for the ARC, but has now been optimized by specialization of each branch and associated quinone. Additionally, CR becomes less probable because the ET cofactors in the B-branch are inactive and because CR is now in the seconds timescale for P+QA−; even if CS occurred in a RC with an empty QB site, there would be adequate time for a new quinone to arrive via diffusion before CR occurred.
Molecular clock studies have recently shown that the D1/D2 ancestor and the L/M ancestor diverged from each other very early after the origin of photosynthesis, with independent D1/D2 and L/M heterodimerization events occurring relatively soon after the split (Cardona 2016a; Cardona et al. 2017). It is therefore logical that the heterodimerization strategy was used twice, early and convergently, in the lineages resulting in Ancestral PSII (ancestral D1/D2) and Ancestral PbRC (ancestral L/M). A signature event in the split between D1/D2 and L/M seems to be the loss of the ChlD/ChlZ site in the PbRC lineage. As explained later in the section “Recruitment of Antennas,” the driving force for this loss may be the association of the Ancestral PbRC with an LH1-like antenna complex. From the points marked “Ancestral PbRC” on our evolutionary scheme (Fig. 4), few changes need to occur to result in the modern PbRC/CfxRC/GmRC lineage. From Ancestral PSII, three changes are required to result in modern PSII, all relating to the antenna domain (CP43/CP47) and the OEC (both discussed in later sections). These changes must have occurred before 2.0–2.5 Gya to precede the rise of oxygen levels in the atmosphere (Fischer et al. 2016).
Accommodation of new function in Type I RCs by changing the ET cofactors
At this point, we return to the Type I lineage to explain the cofactor identity and orientation differences that distinguish it from the Type II lineage. In our analysis, the conclusion drawn is that the function of the ARC, as well as the identities and orientations of ec3/ec2/P, would most resemble those of modern Type II RCs. We make the argument that with the appearance of FX in the Type I lineage, the ec3, ec2, and P cofactors were each affected (likely in that order) in a logical way to adjust to its presence.
The ec3 cofactor: switching to Chl and moving closer to FX
The Type II RCs across the Proteobacteria, Chloroflexi, Gemmatimonadetes, and Cyanobacteria/Eukaryotes use different quinone species (e.g., ubiquinone, menaquinone, plastoquinone) as the final electron acceptor and have different electron donors (e.g., soluble cyt c, bound tetraheme cytochrome, water via the OEC), necessitating adjustment of the ET cofactors to accommodate the change in the energetic gap between donor and acceptor. In Type II RCs, the energetic level of the ec3 site scales with that of the special pair because it is occupied by the demetallated version, Pheo or BPheo [(B)Pheo], of the special pair pigment. For example, in the PbRC, if the special pair contains BChl a, then the ec3 site will contain BPheo a. Since the reduction potential of a (B)Pheo (Pheo + e− → Pheo·−) is always ~ 300 mV more positive than that of the corresponding (B)Chl (Chl + e− → Chl·−) (Fajer et al. 1975), the (B)Chl·+(B)Pheo·− charge-separated state will be the most thermodynamically favorable one of all the potential (B)Chl-based radical pairs, and initial CS will efficiently produce the P+ec3− state. The next ET cofactor after (B)Pheo is the QA quinone, whose reduction potential will always be higher than any (B)Pheo. Thus, ET from (B)Pheo to QA, whatever its chemical identity, will always be favorable. An exception to the above rule is found in PSII from the Chl d-producing cyanobacterium Acaryochloris marina, which uses Chl d in the special pair but retains Pheo a in the ec3 site (Chen et al. 2005; Tomo et al. 2007). Mutants of PbRCs have been produced in which the BPheo is replaced by BChl (the resultant BChl site is termed β) (Kirmaier et al. 1995a, b; Pan et al. 2016). When this change occurs, strong energetic mixing results between the P+ec2− and P+β− states, preventing efficient ET to the quinones and increasing the probability of CR. Therefore, using (B)Pheo as ec3 is probably important for maintaining a redox potential higher than ec2 in Type II RCs.
Invariably, every known Type I RC contains a Chl a molecule or Chl a derivative in the ec3 position, regardless of the pigment content of the rest of the RC. Even the green sulfur bacteria, which use millions of BChl c, d, or e molecules per cell to construct their chlorosomes and thousands of BChl a molecules per cell to construct their antennas and RCs, still make and insert two Chl a molecules specifically into the ec3 site of each GsbRC. No reports have thus far been published of Type I RC mutants that successfully replace the Chl a at ec3 with another pigment. In Type I RCs, the energetic level of ec3 is thus fixed at the level of Chl a and the energy of the P+ec3− state will vary as the rest of the pigments (P, ec2) in the RC change.
In our view, the shift to use of a (B)Chl instead of a (B)Pheo in the ec3 site of the Proto-RC1 was driven by use of the FX cluster as the terminal electron acceptor, which would have a much more negative reduction potential than a mobile quinone. Thus, ET from ec3 to FX will be more favorable if a (B)Chl occupies the ec3 site rather than a (B)Pheo. We propose that this was accomplished by introduction of a residue that could serve as an axial ligand in TMH10.
Because Type I RCs incorporate a version of Chl a in the ec3 site, we must now discuss whether or not the Proto-RC1 used chlorins or bacteriochlorins. The reason for incorporating Chl a, as opposed to a BChl, in the ec3 site is likely explained by its particularly low reduction potential. If the ARC originally used chlorins, then the Proto-RC1 would have inherited this characteristic; presumably it would have used Chl a as its major pigment, and Pheo a would have been present in the ec3 site. The addition of an axial ligand in TMH 10 would have led to its replacement by a Chl a situated close to FX. In the lineages leading to the HbRC and GsbRC, the presence of a Chl a in the ec3 site may have been selected for strongly enough, in the face of a shift to use of BChl as the major pigment, that it was retained as the ec3 cofactor. This would explain why the heliobacteria, green sulfur bacteria, and chloracidobacteria bother to make a version of Chl a, which they put into no other site than ec3. If instead the ARC originally used bacteriochlorins, then the first change would have been replacement of BPheo with BChl in the ec3 site of Proto-RC1. Afterwards, the selection for a better reductant for FX would have led to the use of a Chl a derivative as the ec3 cofactor. This might explain why the Chl a derivatives used in the HbRC and GsbRC are chemically different and may have distinct biosynthetic pathways (Bryant et al. 2012; Sattley and Swingley 2013). At this point we cannot distinguish between these two scenarios. It is also important to note that it is not obvious which structural features of the ec3 site in the HbRC select for the Chl a derivative.
The ec3 cofactor in Type I RCs accepts a H-bond from a residue in a surface α-helix on the cytoplasmic side. This is a Ser residue in the HbRC and a Tyr in PSI. The position is conserved, and occurs immediately after TMH10. The H-bond is to the 131-keto group, which is part of the conjugated π system. Donation of a H-bond to this functional group should raise ec3’s reduction potential (Chl + e− → Chl·−). Thus, the H-bond to ec3 would make it a poorer reductant of the quinone, but a better oxidant of ec2·−. One might think of this as a way of compensating for the very large shift in ec3’s reduction potential caused by its change in chemical identity. Once ec3 changed from (B)Pheo to (B)Chl, the redox gradient from ec2 to ec3 would have been lessened, so donating a H-bond to the 131-keto group of ec3 would restore some of this gradient. Note that no ec2 cofactor in any RC is H-bonded in this way. It is also possible that the placement of the H-bond donor on the surface helix served to move ec3 closer to the membrane surface, and thereby closer to FX.
Changing the binding orientation of ec2
In all known RCs, the ec2 (B)Chl is the same type as the major (B)Chl of the RC. When the positions of the RC (B)Chls in all the RC structures are compared, the main difference between Type I and II RCs is the orientation of the ec2 cofactor, which is in turn determined by the position of its axial ligand (Fig. 6). In PSII and the PbRC, the surface “P-helix,” located within the loop between TMH 3 and 4, of D1/D2 and L/M provides the axial ligand for the ec2 cofactor. In L/M, this is a His residue that directly coordinates the central Mg of ec2. In D1/D2, the axial ligand is a water molecule that usually interacts with a Thr side chain or Ile backbone carbonyl (for a sequence alignment of this region, see Supplementary Information, Table S5). The residue in the analogous position of Type I RCs is either a Tyr (in the HbRC) or Trp (in PSI). The Trp residue in PSI does not interact with ec2. In the HbRC structure, Tyr510 provides a H-bond to the 132-ester carbonyl oxygen of ec2. The axial ligand to ec2 in Type I RCs is instead a water molecule H-bonded to a Gln (HbRC) or Asn (PSI) side chain found in TMH 9 (Fig. 6a, b). This position in the Type II RCs is an Ala or Pro, lacking the ability to coordinate either a (B)Chl central Mg or water (Fig. 6c, d).
The different locations of the axial ligand to ec2 (from the “side” in Type I RCs or from the “top” in Type II RCs) and the consequent change in ec2 orientation result in a noticeable shift in the positioning of the TMHs to accommodate this change (Fig. 7). This is one of the most easily visualized differences between the two types of RCs, with the ET domain monomers of the Type II RCs having a more severe “twist angle” around each other than those in the Type I RCs. A possible photophysical explanation for the change arises from the need to optimize ET to the final electron acceptor in each RC. In early Type I RCs that had recently acquired FX, there would have been selective pressure to optimize ET to FX, as mentioned previously. A solution to this would be to stretch the ET cofactors out so that ec3 moves closer to FX, facilitating the final ET step. By changing the axial ligand location of ec2, thus rotating ec2 more perpendicular to the membrane plane, ec2 would be better able to fit in directly between P and ec3, so that the rate (and thereby the efficiency) of formation of P+ec3− would not suffer. This scenario is supported by the measurement of the closest center-to-center distances between P and ec3 (Fig. 2). In the HbRC and PSI, these distances are 2–4 Å longer than in the PbRC and PSII. In our interpretation, ec3 in Type I RCs sits 2–4 Å farther from P, and correspondingly closer to FX, than it originally did in the ARC.
Another advantage of the new orientation of ec2 is that the vector of the ec2+ec3− dipole (assuming that this is the initial charge-separated state, as it is thought to be in PSI (Müller et al. 2003, 2010)) would be better aligned to take advantage of the static electric field across the RC. As mentioned above, the use of a (B)Chl in place of a (B)Pheo in the ec3 site would lower the reduction potential of ec3, and thereby raise the energy of the ec2+ec3− state. The perpendicular orientation of ec2 would make the ec2+ec3− dipole more perpendicular to the membrane plane, and thus more nearly parallel to the direction of the static electric field across the membrane, caused by the overall negative charge on the donor side and the positive charge on the acceptor side of the RC (Gunner et al. 1996; Gisriel et al. 2017). This would lower the energy of the ec2+ec3− ion pair.
The use of a Tyr in the P-helix as a H-bond donor to ec2 in the HbRC raises an intriguing possibility for the transition from the Type II to the Type I orientation of ec2. We suggest that in the ARC, a residue in this position (likely a His) was the axial ligand to ec2. After the introduction of a residue in TMH9 that could serve as axial ligand, the change in axial ligand was facilitated by conservation of the interaction with the P-helix residue, although its character was changed: from axial ligand of the central Mg(II) to H-bond donor to a peripheral functional group (carbonyl O of the methyl ester substituent on ring E). Note that this carbonyl group is not part of the conjugated π system of the BChl g in the HbRC, and would therefore have little effect on its redox or optical properties. Even a deprotonated imidazole can donate a H-bond, and a phenol can serve as an axial ligand, so it is not clear if a His or a Tyr was the ancestral residue before the change in ec2 orientation. In any case, the use of Tyr allows for a strong H-bond without introducing positive charge in the vicinity of P, which may explain the presence of a Tyr in all HbRCs known today (Supplementary Information, Fig. S1).
Prime chlorophylls (Chl′) in the special pair
The functional role of the prime isomer of (B)Chl (e.g., BChl g′ and Chl a′), which possesses reversed stereochemistry about the 132 carbon in ring E of the tetrapyrrole, in the special pair of Type I RCs has long been a question. The HbRC uses a BChl g′ homodimer as P800 and the GsbRC is predicted to use a BChl a′ homodimer as P840 (Prince et al. 1985; Kobayashi et al. 1992; Ohashi et al. 2010; Sarrou et al. 2012). Chl a- and Chl d-containing PSI use a Chl a/Chl a′ or Chl d/Chl d′ heterodimer as P700 or P740, respectively (Fromme et al. 2001; Akiyama et al. 2002; Ohashi et al. 2010). Therefore, in Type I RCs, the homodimeric variety employs homodimeric BChl′/BChl′ as the special pair, while the heterodimeric variety employs heterodimeric Chl/Chl′ as the special pair. Initial high-resolution X-ray crystal structures of PSI revealed that the P700 Chl a′ participates in a H-bonding network. Specifically, these H-bonds are from PsaA-Tyr735 (to the phytyl chain ester group), PsaA-Thr743 (to the 131-keto O), and a water molecule (to the C-132 ester group). This water molecule may additionally interact with PsaA residues Ser607, Thr743, Tyr603, and Gly739. Because H-bonding can influence the spin density distribution of excited states, it was initially hypothesized that this could bias ET toward one branch of the ET chain (Jordan et al. 2001; Fromme et al. 2001). However, this idea was cast into doubt after it was shown that both branches of PSI were active in electron transfer (Guergova-Kuras et al. 2001; Muhiuddin et al. 2001; Poluektov et al. 2005; Santabarbara et al. 2005) and that mutation of PsaA-Thr743 to Ala did not affect directionality of ET (Li et al. 2004).
The situation is different in the HbRC: the two BChl g′ molecules of P800 do not participate in any H-bonding whatsoever. This indicates that the H-bonding environment around the special pair neither dictates, nor is required for, the presence of the prime epimer. The best explanation for the presence of the prime epimer is minimization of steric clash; the (B)Chls′ simply fit better into the cavity afforded for the special pair at the Type I RC dimer interface. In the HbRC, if the P800 BChl g′ molecules were replaced with BChl g molecules, the 132-methoxycarbonyl group would likely sterically clash with residues of TMH 11, particularly Thr598 and Cys601. This conclusion is reinforced by the observation of a second BChl g′ per PshA in the HbRC structure, which is found in the antenna domain and is coordinated by His36 of TMH1. There is also no H-bonding between this BChl g′ and the protein. However, if a normal BChl g were in this site, its 132-methoxycarbonyl group would likely clash with a neighboring antenna BChl g that is coordinated by a water molecule via Lys596 of TMH11. Furthermore, if the Chl a′ of P700 in PSI were a Chl a, it would likely clash with Phe598 of TMH9. As shown in Fig. 7, with the change in ec2 orientation in Type I RCs comes a subtle change in the orientation of the TMHs. The changes in the orientation of TMH9 and TMH11 specifically will affect the binding of P. We believe that the result of these changes is that the prime epimer is best accommodated in the P site, due to steric effects.
No enzyme responsible for making the C132 epimer of any (B)Chl has been identified. An intriguing hypothesis put forward by Webber and Lubitz (Webber and Lubitz 2001) was that the PSI polypeptide itself catalyzed the conversion of Chl a to a Chl a′, which would then be stabilized by optimal H-bonding to the nearby water molecule coordinated by several PsaA residues (Tyr603, Ser607, Thr743, Gly739). Residue PsaA-Thr743 would play a crucial role in the postulated mechanism, as the H-bond it donates to the 131-keto oxygen could stabilize an enolate intermediate after deprotonation of C132. However, mutation of this Thr to Ala did not result in loss of Chl a′ (Li et al. 2004), largely disproving this hypothesis. The simplest explanation is that (B)Chl synthesis can result in a small amount of the C132 epimer. The enzyme responsible for closure of ring E might even be capable of interconverting the two isomers at a low rate. However, when the C132 epimer is made, selection for it at a specific site would be according to whichever isomer fits best. Although both epimers could compete for any given (B)Chl site, the low amount of the prime C132 epimer would result in it losing this competition except for those sites where the much more abundant (non-prime) epimer could not bind.
With all of these lines of evidence taken together, we believe that there is no energetic role for (B)Chl′. There is only a structural role, and a relatively trivial one. We also believe it likely that the LCA of Type I RCs (LCA-RC1) contained two (B)Chl′ in its special pair, driven by the change in ec2 orientation. Thus, if the ARC contained the same ec2 orientation as modern Type II RCs, then it likely did not contain prime (B)Chls as P. The shift to use of the C13² epimer in the P site would have been an automatic consequence of the change in ec2 orientation and concomitant rearrangement of the ET domain TMHs.
The evolution of photosystem I was profoundly affected by the rise of atmospheric oxygen
Now that we have explained the changes in the ET cofactors that are common to all Type I RCs, we will move on to discussing the changes that resulted in the split between the aerobic PSI and anaerobic HbRC/GsbRC/CabRC lineages. The key to understanding these changes lies in the need to deal with dioxygen, which was the product of an ancestor of PSII that had acquired the oxygen-evolving complex (OEC).
The rise of dioxygen forced most anaerobic organisms containing a homodimeric Type I RC to avoid aerobic conditions, due to the intense stress they would have experienced. This stress would have come in the form of singlet oxygen (¹O₂), produced from a reaction between O₂, which has a triplet ground state (³O₂), and the P triplet state (³P), which is generated by CR of P+ec3−. ET from ec3 to FX (τ ≈ 0.8 ns) is only ~20-fold faster than CR of P+ec3− (τ ≈ 20 ns). This ratio of forward:reverse ET rate is lower than that of any other ET step in Type I RCs and means that CR would occur in a small but significant fraction of CS events, even if FX were oxidized and available to accept an electron. Some organisms with Type I RCs that could not avoid aerobic conditions overcame this oxidative stress by modifying their Type I RC. We assume that most of the changes necessary to fashion PSI, especially the last ones, took place in the presence of a water-oxidizing Type II RC (either PSII or a close ancestor). We term the Type I RCs that followed this scenario “Ancestral PSI.” Every major ET-domain modification that occurred on the way from the LCA-RC1 to Ancestral PSI can be explained by the avoidance of singlet oxygen production, which is achieved by minimizing the P+ec3− CR reaction.
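The size of that "small but significant fraction" can be estimated with a short calculation. The sketch below assumes that forward ET and CR compete as simple parallel first-order processes, a simplification that is not stated in the text, and uses only the two lifetimes quoted above; the function name is illustrative.

```python
def recombination_yield(tau_forward_ns: float, tau_recomb_ns: float) -> float:
    """Fraction of P+ec3- states that decay by charge recombination (CR).

    Assumes two competing first-order channels (an illustrative
    simplification): forward ET to FX and CR back to P.
    """
    k_forward = 1.0 / tau_forward_ns  # rate constant of ec3 -> FX, ns^-1
    k_recomb = 1.0 / tau_recomb_ns    # rate constant of P+ec3- CR, ns^-1
    return k_recomb / (k_forward + k_recomb)


# Lifetimes quoted in the text: 0.8 ns (forward ET) and 20 ns (CR).
cr_fraction = recombination_yield(0.8, 20.0)
print(f"forward:reverse rate ratio = {20.0 / 0.8:.0f}")
print(f"estimated CR fraction      = {cr_fraction:.1%}")
```

Under this assumption, the quoted lifetimes give a rate ratio of 25 (the same order as the ~20-fold figure) and a CR yield of roughly 4% per charge-separation event, which would add up quickly under continuous illumination.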
To successfully avoid CR of the P+ec3− state, the quinones in the LCA-RC1 were moved further into the interior of the RC to serve as tightly bound ET intermediates between ec3 and FX, losing mobile quinone reduction activity in the process. It must be emphasized that this change does not enhance the rate of ET from ec3 to FX; it is likely that the LCA-RC1 could already do this on the ~1-ns timescale, as it does in the HbRC now (Chauvet et al. 2013; Ferlez et al. 2016). In fact, the overall ET rate from P to FX is much slower in PSI using the tightly bound PhQ (tens-to-hundreds of ns) than it is in the HbRC (0.5–1 ns). Slowing the rate of forward ET may seem counter-intuitive from a gain-of-fitness perspective. However, the point of this change was not to increase the overall rate of ET to FX, but to minimize P+ec3− CR. The key is that CR of the P+Q− state in PSI does not produce a ³P700 triplet (Warren et al. 1993). Thus, once the electron is on the quinone, CR is “safe.” In order to complete the transition to permanently bound quinone, the protonation channels were lost so that the quinone could not be doubly reduced. Note that if the quinone-binding site moved during this transition, the new site would be cut off from the proton channels automatically; afterwards the residues making up the channel could be lost slowly over time. The final effect is that the electron is always transferred rapidly from ec3 to the quinone, due to the short distance and very strong driving force. The system loses so much energy that the next step in PSI (ET from Q to FX) is almost isoenergetic or slightly uphill (Santabarbara et al. 2005). The fact that this last step slows the overall rate of ET from P to FX does not matter in the end, since the net electron throughput of the RC is not determined by this rate; it is instead limited by the diffusion rate of exogenous electron donors and acceptors to and from the RC.
A concerted movement of the ec3 and quinone sites during the transition from LCA-RC1 to PSI may not have been very difficult. Key residues responsible for binding these two cofactors are next to each other in the polypeptide sequence (Fig. 8). The quinone-binding site of PSI is made up of 3 key residues: a Trp that stacks aromatically with the naphthoquinone headgroup on one side, and a Phe and Leu side chain on the other side (Jordan et al. 2001). On TMH10, the axial ligand to ec3 (Ser/water in HbRC, Met in PSI) is right before the Phe (Met in the HbRC). On the surface helix immediately after TMH10, the H-bond donor to ec3 (Ser in the HbRC, Tyr in PSI) is right before the Trp (Arg in the HbRC). The Leu, which is on the loop immediately before TMH11, is present in the HbRC. Thus, one could imagine that only two mutations, each of which affected two neighboring residues, could have accomplished most of the work of pushing ec3 up slightly and creating a new quinone-binding site. The change of axial ligand from Ser to Met would have tuned the reduction potential of ec3 and allowed its subsequent move; the change of Met to Phe may not have affected the structure much at all. Using a longer residue as H-bond donor (Ser to Tyr) would have had the effect of pushing ec3 slightly further from the membrane surface, allowing the quinone to bind by aromatically stacking with the new Trp residue, with the Phe and Leu on the other side to provide a hydrophobic environment for the menaquinone headgroup.
As the residues in the two PhQ-binding sites in PSI are highly conserved between the PsaA and PsaB subunits, the burying and isolation of the quinone sites must have occurred in the homodimeric state, before diversification of the core polypeptide into PsaA and PsaB. It has been argued that the asymmetry of the quinone sites in PSI is due to removing the risk of ¹O₂ production: as the semiquinone in the PhQA site is lower in energy than in the PhQB site, CR from FX would proceed through PhQA, subsequently tunneling to the ground state (Rutherford et al. 2012). We support this argument and would extend it further. It is likely that in the homodimeric state the quinone would have been more like PhQA (i.e., lower in energy in the semiquinone state). After formation of the heterodimer, one of the quinones was free to raise its energy, although the reason for this is presently unclear.
Ancestral PSI would have dealt with a second issue related to the presence of atmospheric oxygen: long-lived FX− states can generate superoxide (O₂⁻), another reactive oxygen species (ROS). However, unlike singlet oxygen, a biological remediation pathway does exist to eliminate superoxide in the form of the enzyme superoxide dismutase. The product of this enzyme is hydrogen peroxide, which can be reduced to water by peroxidases or dismutated by catalase. In effect, Ancestral PSI sacrificed its ability to reduce mobile quinones to prevent the formation of an ROS that the cell could not remediate (singlet oxygen), in the process allowing occasional formation of an ROS that the cell could remediate (superoxide). The sacrifice was not a large one; the loss of this ability in PSI would have further strengthened the linear electron flow pathway from PSII (which could still reduce quinone) to NADPH.
A way to spatially separate the charge-separated state even further was to recruit a modified version of the ferredoxin acceptor as a permanently bound subunit. In Modern PSI, this is the PsaC subunit, which contains the FA/FB clusters and is bound to the PsaA/PsaB core. As one would expect, the crystal structures of PSI reveal that PsaC binds asymmetrically to the PsaA/PsaB heterodimer. Therefore, heterodimerization likely occurred in Ancestral PSI to encourage stronger binding of PsaC, resulting in Modern PSI. The result of this would be to further stabilize the CS state, increasing its lifetime to ~100 ms. Without the FA/FB clusters, if the acceptor pool were heavily reduced, the Q·−FX− state could eventually accumulate under high light, leading to detrimental CR of P+ec3−. With the additional two clusters, PSI would have to accumulate 4 electrons (on all 3 Fe–S clusters as well as one of the quinones) before P+ec3− CR would occur, making this situation very unlikely. It should be noted that this can occur under high light in anaerobic conditions, but the lack of atmospheric oxygen removes the danger of singlet oxygen or superoxide, thus explaining why an ancestor to the HbRC/GsbRC/CabRC did not fix its quinones or heterodimerize; it was not forced to acquire a mechanism to avoid ROS.
Recruitment of the antenna domain
Finally, we will consider a last piece of the evolutionary puzzle unrelated to the tuning of the redox function of the ET cofactors: the recruitment of the antennas. If we imagine that the ARC had an antenna domain fused to the N-terminus of the ET domain, similar to what is found in modern Type I RCs, this domain must have been lost in the PbRC lineage and split off into a separate subunit in the Ancestral PSII lineage. The latter change is a relatively minor one, but the former is a problem. What possible advantage would the ancestor to the PbRC have gained by losing so many antenna pigments? One cannot use the argument that this allowed the LH1 antenna complex to bind to it, as LH1 would not have been “waiting in the wings” for the antenna domain to be lost so it could associate with the ET domain. As mentioned before, we will avoid any step that results in a loss of fitness unless there is a compelling argument for it. There is no such argument here. It seems much more likely that the ARC lacked an antenna domain and the lineages that evolved from it developed their own antennas, which over time may have been transferred around via lateral gene transfer (LGT). In this final section, we will consider the LGT events in which the 6-TMH antenna domain found in HbRC/GsbRC/CabRC/PSI/PSII radiated, how the presence of antenna (B)Chls in the ET domain relates to the antenna domain radiation, and finally, the evolution and radiation of single-TMH antenna subunits.
LGT of the 6-TMH antenna domain
During analysis of the new HbRC structure, a curious finding was that the HbRC antenna domain and CP43/CP47 antenna structurally align with lower RMSD than the HbRC and PSI antenna domains do (Gisriel et al. 2017). There is also a distinct β-hairpin structure conserved between the HbRC, CP43, and CP47, on the donor side of the antenna domains, which does not appear in PSI (Supplementary Information, Fig. S6). This offers a surprising contrast with the superpositions of the ET domains and points to a more complicated evolutionary history than previously thought.
The simplest explanation we can propose is that Ancestral PSII gained its antenna domain via LGT from an ancient Type I RC that already contained the antenna domain (Fig. 4). The LGT event would have transferred a single gene for the antenna domain without fusing it to the Ancestral PSII core (which had already heterodimerized). Such an event would have occurred before the Ancestral PSII gained an OEC, because the antenna domain is a requisite partner for binding the OEC in Modern PSII. From there, the unfused antenna polypeptide of Ancestral PSII would have later diverged into modern CP43 and CP47.
If the Type I RC ancestor serving as the donor of the antenna domain transferred the antenna domain to Ancestral PSII before the split of the homodimeric Type I RCs from PSI, it would imply that both the HbRC antenna domain and CP43/CP47 have structurally diverged little while the PSI antenna domain has structurally diverged more, a scenario for which we have no justification. If, however, the ancient Type I RC that transferred the antenna domain did so after the split of the homodimeric Type I RCs and PSI, and was an ancestor in the HbRC/GsbRC/CabRC lineage, it would provide a good explanation as to why the HbRC antenna domain and CP43/CP47 remain relatively similar.
An alternative evolutionary scheme in which the ancestral antenna originated in Ancestral PSII and was transferred via LGT to a lineage containing Proto-RC1 is less compelling. Due to the distribution of the antenna domains, the ancient LGT event would have had to occur from Ancestral PSII before the Type I lineages split. As explained above, the fact that the HbRC antenna and CP43/CP47 have structurally diverged little, while PSI has structurally diverged more, cannot be easily explained by this scenario. Secondly, the pattern of (B)Chls that are bound by the ET domain but function as antenna pigments (discussed below) does not support this scenario.
Antenna (B)Chl sites in the ET domains
There are several antenna (B)Chls in the Type I RCs that are coordinated by residues in the ET domain (Fig. 9). In the HbRC, 7 BChl g molecules are bound in this fashion by each PshA polypeptide. Almost twice as many Chl a are bound by the ET domains of PsaA (11) and PsaB (12) in cyanobacterial PSI; these include the intimately associated “red-absorbing” chlorophylls (Jordan et al. 2001), which are not found in the HbRC. In sharp contrast, the PSII core polypeptides (D1 and D2) each bind a single antenna Chl, called ChlZ/ChlD (or sometimes D1-ChlZ/D2-ChlZ) (Zouni et al. 2001). No antenna sites are found in the structures of the L/M heterodimer of the PbRC (Deisenhofer et al. 1985); the site of the His residue used to bind ChlZ/ChlD in PSII is generally a Phe or Cys residue in the PbRC. Based upon our sequence alignments, the Chloroflexi RC (CfxRC) and Gemmatimonadetes RC (GmRC) are not expected to contain the ChlD/ChlZ site either (Supplementary Information, Fig. S7). All antenna (B)Chls bound by the ET domains are shown in Fig. 9a.
There are four ET domain-bound antenna (B)Chl sites that are conserved in the HbRC and PSI in terms of sequence (axial ligand) and spatial position. One of these sites is analogous to ChlZ/ChlD in PSII. There is also a fifth site in which (B)Chls are in a similar position in the structure, but the coordinating residues are not homologous (Fig. 9b). Three of the five aforementioned sites are situated closer to ec2, while the others lie closer to ec3. We believe energy transfer from the antenna to ec3 to be unlikely in the HbRC because ec3 is a Chl a derivative, thus requiring an uphill step from the lower-energy BChl g antenna. The same situation is likely to hold true in the GsbRC and CabRC, since their antenna pigments are BChl a and ec3 is also a Chl a. Of course, this is not the case in PSI, where all the pigments are Chl a.
Since the ChlZ/ChlD site is conserved across the HbRC, GsbRC, PSI, and PSII, and lies closest to the ec2 site, a logical conclusion is that it is the major bridging site that energetically connects the antenna domain (or CP43/CP47) to the ET chain, with energy transfer usually proceeding to the ET chain to initiate CS through the ec2 site, which does not require uphill energy transfer. Curiously, the sequence of the CabRC predicts a Trp residue at the ChlZ/ChlD site, which may not ligate a BChl (see Supplementary Information, Fig. S7). Since none of the anoxygenic Type II RCs (which lack the 6-TMH antenna domain) contain a ChlD/ChlZ-type bridging site, but PSII and most known Type I RCs do, a simple evolutionary scenario to explain this is that the ChlD/ChlZ site was present in the ARC and predates the Type I/II split. This site may have been important for either aiding in energy transfer from a now-extinct primordial antenna complex or increasing photon absorption by the ET domain. Following this scenario, this antenna site was lost in the last common ancestor of the PbRC/CfxRC/GmRC. The drive to lose this site may have been the recruitment of an ancient LH1-like antenna. In the three-dimensional structures from extant PbRCs, the position of the ChlD/ChlZ cofactor would occupy the same space as an LH1 polypeptide; inclusion of a BChl at this position would likely clash with LH1 and inhibit its association with the RC (Supplementary Information, Fig. S7). Considering that the closest distance between any BChl in LH1 and the ET core in the PbRC is about 11 Å longer than the distance between the ChlD/ChlZ site and the ET core, the ancestor of the PbRC may have sacrificed a close distance between antenna pigments and the ET core to gain a large increase in antenna size. The ChlD/ChlZ site was probably advantageous for Ancestral PSII to maintain, if only for using it as an antenna site. This advantage was greatly magnified after the recruitment of ancestral CP43/CP47 and the OEC, to enhance energy transfer efficiency to the ET cofactors (Schelvis et al. 1994; Lince and Vermaas 1998) and to act in a photoprotective role in oxygenic conditions (Schweitzer and Brudvig 1997; Schweitzer et al. 1998).
The larger number of antenna sites in the Type I ET domains (i.e., 6–12) implies that the antenna domain has been present in this lineage longer, enough time for the Type I RCs to add more bridging sites between ET and antenna domains. If PSII was the progenitor of the antenna domain, one would expect that it would have evolved more ET domain-bound antenna sites to aid in energy transfer to the ET core. Therefore, according to our evolutionary scheme, the antenna domain was gained by an ancestor of all modern Type I RCs, after the acquisition of FX but before the diversification of the HbRC/GsbRC/CabRC and PSI. The first complex labeled in our evolutionary scheme to contain this domain is LCA-RC1. After the split between the lineages that led to modern homodimeric RCs and PSI, an ancestor to modern homodimeric RCs laterally transferred this antenna domain to an ancestor of PSII.
Evolution of single-TMH extrinsic antenna subunits
A surprising discovery in the recent HbRC structure was the presence of two single-TMH subunits, one on each side of the core. This polypeptide, named PshX, had not previously been documented. Its primary sequence was identified by crystallography and mass spectrometry (Gisriel et al. 2017). Attempts to identify a homolog to the pshX gene in the Heliorestis convoluta draft genome have so far been unsuccessful. The single-TMH subunits PsaI and PsaJ are present in PSI in the same general position, but no sequence homology can be seen between these polypeptides and PshX. Interestingly, the gene encoding PshX lies outside of the photosynthetic gene cluster, which contains pshA, the genes for cyt c553, the cyt bc complex, and pigment biosynthesis. This suggests that the gene encoding PshX may not have been transferred with the other genes in the cluster and was an independent evolutionary invention of the Heliobacteria. This implies that small transmembrane subunits may be cases of convergent evolution. Its small size (31 residues) makes its de novo evolution conceivable. Genomic DNA sequences capable of encoding hydrophobic peptides of sufficient length to cross the membrane are not very rare (Hemm et al. 2008). If one of these had some affinity for the HbRC and managed to render the complex more stable in some way, then it would be selected for. The same sort of selection may well explain the numerous small hydrophobic subunits at the periphery of PSI, PSII, cyt b6f, NADH dehydrogenase, etc. Often, deletion of such subunits results in a lower steady-state level of the complex due to more rapid degradation [e.g., PsaJ in eukaryotic PSI (Hansson et al. 2007)]. In any case, the PshX subunit found today has two antenna pigment (BChl g)-binding sites, and thus contributes 4 of the 54 antenna BChls (i.e., 7.4%). 
The idea that short transmembrane polypeptides may be recruited to the RC rather easily, and thus exhibit convergent evolution, causes us to remain uncommitted as to whether the LCA of Type I RCs may have exhibited transmembrane subunits other than the dimeric core.
Conclusion and outlook
In conclusion, the recent 2.2-Å crystal structure of the HbRC and its unexpected quinone reduction activity have compelled us to re-evaluate our thinking on the evolution of the photochemical RCs. Our proposed evolutionary trajectory rests upon the hypothesis that the last common ancestor of all RCs was a homodimeric complex that functioned to reduce quinone to quinol using an inefficient radical disproportionation reaction, allowing for an innovation in the biosphere: light-driven cyclic electron flow. All of the diversification presently observed across the disparate phototrophic taxa is attributable to the need to, first, optimize this reaction, and much later, adjust to rising oxygen concentrations. We acknowledge the limitations inherent in trying to understand protein-level changes that occurred over three billion years ago. However, we believe that our approach, which considers every functional aspect of the RC, can help fill in informational gaps that genetic and structural data alone cannot fill. Recent advances in structural biology methods allow for optimism that representative RC structures from more phototrophic groups will eventually be solved, and recent advances in metagenomic analyses allow for optimism that new and interesting phototrophs will be discovered. Marrying these future findings with our underlying functional analysis will help to further refine this evolutionary picture.
Sequence alignments and structural visualization
For non-structure-based sequence alignments, each sequence was retrieved from the UniProt database (The UniProt Consortium 2017), and Clustal Omega (Sievers et al. 2011) was used for initial sequence alignments. For structure-based sequence alignments, the RC structures were gathered from the Protein Data Bank (PDB) (Berman et al. 2000), trimmed according to the domain of interest in PyMOL (2014), and all amino acids were changed into alanines using Coot (Emsley et al. 2010). Table S1 (Supplementary Information) provides a list of all structures used for the structure-based phylogenetic analysis for the ET domains. Two structure-based MSAs were constructed for each dataset: one from the PROMALS3D server (Pei et al. 2008) and one from the Protein Data Bank in Europe PDBeFOLD server (Krissinel and Henrick 2005). Together with the non-structure-based method, this resulted in a total of 3 MSAs (two structure-based and one sequence-based). When showing the three-dimensional overlay of RC structures, the super function of PyMOL was used. When comparing RC polypeptides that do not have available structures (such as the CfxRC), these sequences were fetched from the UniProt database.
MSA trimming and model selection
We applied 3 different data trimming strategies to each MSA: no gap deletion, 50% site coverage cutoff, and full deletion for positions containing gaps. To determine which model should be used to assess the evolutionary relationships between the non-structure-based MSAs (Clustal Omega) and structure-based MSAs (PROMALS3D and PDBeFOLD), the “Find Best Protein Model” function of the MEGA7 software (Kumar et al. 2016), which ranks 56 protein substitution models according to Bayesian and Akaike information criteria and the Maximum Likelihood fit, was used. The top three amino acid substitution models for each MSA were variations of LG (Le and Gascuel 2008) and WAG (Whelan and Goldman 2001) (Supplementary Information, Table S6). The three models for each dataset were identified with the top-scoring specification (Whelan and Goldman 2001) with or without 4 gamma-distributed rate categories, and with or without proportions of invariant sites or amino acid frequencies from the data. Phylogenetic trees were built under each of the top three models, for a total of 27 phylogenetic trees (3 MSAs times 3 trimming methods times 3 best substitution models; Supplementary Information, Table S7). Support for nodes in these 27 trees was assessed by 100 bootstrap replicates.
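The three trimming strategies can be viewed as one column filter with different coverage thresholds. The following is a minimal sketch of that idea, not the MEGA7 implementation: the function name, the toy alignment, and the mapping of "no gap deletion", "50% site coverage", and "full deletion" onto thresholds of 0.0, 0.5, and 1.0 are all assumptions made for illustration.

```python
from typing import List

GAP = "-"


def trim_alignment(seqs: List[str], min_coverage: float) -> List[str]:
    """Keep alignment columns whose non-gap fraction is >= min_coverage."""
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for col in range(length):
        non_gap = sum(1 for s in seqs if s[col] != GAP)
        if non_gap / len(seqs) >= min_coverage:
            keep.append(col)
    # Rebuild each sequence from the retained columns only.
    return ["".join(s[c] for c in keep) for s in seqs]


# Hypothetical toy alignment (not taken from the RC datasets).
toy = ["MK-LV", "MKA-V", "M--LV", "MK-LV"]
no_deletion = trim_alignment(toy, 0.0)    # every column retained
half_coverage = trim_alignment(toy, 0.5)  # drops the mostly-gap column
full_deletion = trim_alignment(toy, 1.0)  # only gap-free columns remain

# Tree count from the text: 3 MSAs x 3 trimming methods x 3 models.
n_trees = 3 * 3 * 3
print(half_coverage, full_deletion, n_trees)
```

Stricter thresholds discard more columns, so the comparison across the three settings shows how sensitive a topology is to poorly aligned, gap-rich regions.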
The resulting 3 groups of 9 trees were examined for topological variation and a tree that exhibited the most common and well-supported topology was chosen to represent the overall data and re-created with 1000 bootstrap replicates: one for the non-structure-based tree (Fig. 3a) and one for the structure-based tree (Supplementary Information, Fig. S4). For all trees, a pairwise distance matrix was estimated with a JTT model, and the Neighbor-Join and BioNJ algorithms were applied to identify the initial tree for the heuristic search. The final non-structure-based tree was made with the WAG model (Whelan and Goldman 2001) with frequencies (+ F) and gamma-distributed rates (+ G, 4 categories, parameter = 8.1159), which exhibited a maximum log likelihood value of −7448.88, and the structure-based tree was made from the PROMALS3D MSA with the LG model (Le and Gascuel 2008) with frequencies (+ F), gamma-distributed rates (+ G, 4 categories, parameter = 5.6588), and invariant sites (+ I), which exhibited a maximum log likelihood value of −8785.34.
Phenetic dendrogram construction
The 19 individual trimmed structures (see “Sequence alignments and structural visualization” section above) were superimposed onto one another in pairwise fashion, biased by TMHs, using the super function in PyMOL (2014) and a triangular RMSD matrix was created (for the PyMOL script formula, see Supplementary Information, Table S8). The matrix was converted into MEGA format, read into MEGA7 (Kumar et al. 2016), and a phenetic dendrogram was inferred using the UPGMA method (Sneath and Sokal 1973). The optimal dendrogram with the sum of branch length = 9.821 is shown. The dendrogram is drawn to scale, with branch lengths in the same units as those of the RMSD distances used to infer it. The UPGMA method automatically assumes a root at the deepest node in order to calibrate the RMSD distance scale.
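For readers unfamiliar with UPGMA, the clustering step can be sketched in a few lines. This is an illustrative pure-Python version, not the MEGA7 routine used here; the function name and the four-taxon distance matrix are hypothetical and stand in for the real 19 × 19 RMSD matrix.

```python
from typing import List, Tuple


def upgma(names: List[str], matrix: List[List[float]]) -> Tuple[str, float]:
    """Cluster a symmetric distance matrix by UPGMA.

    Returns a Newick string and the height of the root node.
    """
    labels = list(names)            # Newick fragment of each active cluster
    sizes = [1] * len(names)        # number of leaves in each cluster
    heights = [0.0] * len(names)    # node height of each cluster
    d = [row[:] for row in matrix]  # work on a copy of the input
    while len(labels) > 1:
        n = len(labels)
        # Pick the closest pair of clusters (first pair wins on ties).
        i, j = min(((a, b) for a in range(n) for b in range(a + 1, n)),
                   key=lambda p: d[p[0]][p[1]])
        h = d[i][j] / 2.0           # ultrametric height of the new node
        new_label = (f"({labels[i]}:{h - heights[i]:.3f},"
                     f"{labels[j]}:{h - heights[j]:.3f})")
        merged = sizes[i] + sizes[j]
        keep = [k for k in range(n) if k not in (i, j)]
        # Size-weighted mean distance from the merged cluster to the rest.
        new_row = [(sizes[i] * d[i][k] + sizes[j] * d[j][k]) / merged
                   for k in keep]
        d = [[d[a][b] for b in keep] for a in keep]
        for row, dist in zip(d, new_row):
            row.append(dist)
        d.append(new_row + [0.0])
        labels = [labels[k] for k in keep] + [new_label]
        sizes = [sizes[k] for k in keep] + [merged]
        heights = [heights[k] for k in keep] + [h]
    return labels[0] + ";", heights[0]


# Hypothetical pairwise RMSD values (in Å) for four structures.
toy_names = ["A", "B", "C", "D"]
toy_rmsd = [[0.0, 1.0, 3.0, 4.0],
            [1.0, 0.0, 3.0, 4.0],
            [3.0, 3.0, 0.0, 2.0],
            [4.0, 4.0, 2.0, 0.0]]
newick, root_height = upgma(toy_names, toy_rmsd)
n_pairs = 19 * 18 // 2  # pairwise superpositions needed for 19 structures
print(newick, root_height, n_pairs)
```

Because every merge places the new node at half the inter-cluster distance, all leaves end up equidistant from the root, which is why the method implies a root at the deepest node.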
The authors thank Dr. Gillian Gile (Arizona State University) for consultation on phylogenetic tree construction, Mr. Bill Johnson (Arizona State University) for artistic input on figures, and Dr. Robert Blankenship (Washington University in St. Louis) for sharing the draft genome of Heliorestis convoluta. All authors were supported by the Division of Chemical Sciences, Geosciences, and Biosciences, Office of Basic Energy Sciences, of the U.S. Department of Energy through Grant DE-SC0010575.
- Akiyama M, Miyashita H, Kise H et al (2002) Quest for minor but key chlorophyll molecules in photosynthetic reaction centers—unusual pigment composition in the reaction centers of the chlorophyll d-dominated cyanobacterium Acaryochloris marina. Photosynth Res 74:97–107. https://doi.org/10.1023/A:1020915506409
- Blankenship RE (2014) Molecular mechanisms of photosynthesis, 2nd edn. Wiley, Chichester
- Bryant DA, Liu Z, Li T et al (2012) Comparative and functional genomics of anoxygenic green bacteria from the taxa Chlorobi, Chloroflexi, and Acidobacteria. In: Burnap R, Vermaas W (eds) Functional genomics and evolution of photosynthetic systems. Springer, Dordrecht, pp 47–102
- Cardona T, Sanchez-Baracaldo P, Rutherford AW, Larkum AWD (2017) Molecular evidence for the early evolution of photosynthetic water oxidation. https://doi.org/10.1101/109447
- Diner BA, Petrouleas V, Wendoloski JJ (1991) The iron-quinone electron-acceptor complex of photosystem II. Physiol Plant 81:423–436. https://doi.org/10.1111/j.1399-3054.1991.tb08753.x
- Doolittle RF (1986) Of URFs and ORFs: a primer on how to analyze derived amino acid sequences. University Science Books, Mill Valley
- Fischer WW, Hemp J, Johnson JE (2016) Evolution of oxygenic photosynthesis. Annu Rev Earth Planet Sci 44:647–683. https://doi.org/10.1146/annurev-earth-060313-054810
- Furbacher PN, Tae GS, Cramer WA (1996) Evolution and origins of the cytochrome bc1 and b6f complexes. In: Baltscheffsky H (ed) Origin and evolution of biological energy conservation. Wiley, New York, pp 221–253
- Granick S (1957) Speculations on the origins and evolution of photosynthesis. Ann N Y Acad Sci 69:292–308. https://doi.org/10.1111/j.1749-6632.1957.tb49665.x
- Hohmann-Marriott MF, Blankenship RE (2011) Evolution of photosynthesis. Annu Rev Plant Biol 62:515–548. https://doi.org/10.1146/annurev-arplant-042110-103811
- Kirmaier C, Holten D, Parson WW (1985) Picosecond-photodichroism studies of the transient states in Rhodopseudomonas sphaeroides reaction centers at 5 K. Effects of electron transfer on the six bacteriochlorin pigments. Biochim Biophys Acta 810:49–61. https://doi.org/10.1016/0005-2728(85)90205-1
- Kirmaier C, Laporte L, Schenck CC, Holten D (1995a) The nature and dynamics of the charge-separated intermediate in reaction centers in which bacteriochlorophyll replaces the photoactive bacteriopheophytin. 1. Spectral characterization of the transient state. J Phys Chem 99:8903–8909. https://doi.org/10.1021/j100021a067
- Kirmaier C, Laporte L, Schenck CC, Holten D (1995b) The nature and dynamics of the charge-separated intermediate in reaction centers in which bacteriochlorophyll replaces the photoactive bacteriopheophytin. 2. The rates and yields of charge separation and recombination. J Phys Chem 99:8910–8917. https://doi.org/10.1021/j100021a068
- Kobayashi M, van de Meent EJ, Oh-oka H et al (1992) Pigment composition of heliobacteria and green sulfur bacteria. In: Murata N (ed) Research in photosynthesis, vol 1. Kluwer, Dordrecht, pp 393–396
- Krissinel E, Henrick K (2005) Multiple alignment of protein structures in three dimensions. Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp 67–78
- Muhiuddin IP, Heathcote P, Carter S et al (2001) Evidence from time resolved studies of the P700+/A1− radical pair for photosynthetic electron transfer on both the PsaA and PsaB branches of the photosystem I reaction centre. FEBS Lett 503:56–60. https://doi.org/10.1016/S0014-5793(01)02696-5
- Müller MG, Niklas J, Lubitz W, Holzwarth AR (2003) Ultrafast transient absorption studies on photosystem I reaction centers from Chlamydomonas reinhardtii. 1. A new interpretation of the energy trapping and early electron transfer steps in photosystem I. Biophys J 85:3899–3922. https://doi.org/10.1016/S0006-3495(03)74804-8
- Nelson N, Yocum CF (2006) Structure and function of photosystems I and II. Annu Rev Plant Biol 57:521–565. https://doi.org/10.1146/annurev.arplant.57.032905.105350
- Nitschke W, Mattioli T, Rutherford AW (1996) The Fe-S type photosystems and the evolution of photosynthetic reaction centers. In: Baltscheffsky H (ed) Origin and evolution of biological energy conservation. VCH, New York, pp 177–203
- Okamura MY, Feher G (1992) Proton transfer in reaction centers from photosynthetic bacteria. Annu Rev Biochem 61:861–896. https://doi.org/10.1146/annurev.bi.61.070192.004241
- PyMOL (2014) The PyMOL molecular graphics system, version 1.8. Schrödinger, LLC. https://pymol.org
- Rutherford AW, Mattiolo TA, Nitschke W (1996) The FeS-type photosystems and the evolution of photosynthetic reaction centers. In: Baltscheffsky H (ed) Origin and evolution of biological energy conversion. VCH, New York, pp 177–204
- Santabarbara S, Heathcote P, Evans MCW (2005) Modelling of the electron transfer reactions in photosystem I by electron tunnelling theory: the phylloquinones bound to the PsaA and the PsaB reaction centre subunits of PS I are almost isoenergetic to the iron-sulfur cluster FX. Biochim Biophys Acta 1708:283–310
- Sneath PHA, Sokal RR (1973) Numerical taxonomy. Freeman, San Francisco
- Whelan S, Goldman N (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 18:691–699. https://doi.org/10.1093/oxfordjournals.molbev.a003851