Abstract
An unresolved question in the origin and evolution of life is whether a continuous path from geochemical precursors to the majority of molecules in the biosphere can be reconstructed from modern-day biochemistry. Here we identified a feasible path by simulating the evolution of biosphere-scale metabolism, using only known biochemical reactions and models of primitive coenzymes. We find that purine synthesis constitutes a bottleneck for metabolic expansion, which can be alleviated by non-autocatalytic phosphoryl coupling agents. Early phases of the expansion are enriched with enzymes that are metal dependent and structurally symmetric, supporting models of early biochemical evolution. This expansion trajectory suggests distinct hypotheses regarding the tempo, mode and timing of metabolic pathway evolution, including a late appearance of methane metabolisms and oxygenic photosynthesis consistent with the geochemical record. The concordance between biological and geological analyses suggests that this trajectory provides a plausible evolutionary history for the vast majority of core biochemistry.
Data availability
The analyses presented in this study are based on data retrieved from the KEGG database. Due to licensing agreements and proprietary restrictions associated with the KEGG database, we are unable to provide the raw data directly. However, data can be accessed through the KEGG database (https://www.kegg.jp/), subject to their respective licensing policies and agreements. Processed datasets are available in Supplementary Tables 1–15 and via the following GitHub repository: https://github.com/jgoldford/metabolic-continuity.
Code availability
Code is available via the following GitHub repository: https://github.com/jgoldford/metabolic-continuity
References
Smith, E. & Morowitz, H. J. The Origin and Nature of Life on Earth 677 (Cambridge Univ. Press, 2016).
Smith, E. & Morowitz, H. J. Universality in intermediary metabolism. Proc. Natl Acad. Sci. USA 101, 13168–13173 (2004).
Hartman, H. Speculations on the origin and evolution of metabolism. J. Mol. Evol. 4, 359–370 (1975).
Goldford, J. E. & Segrè, D. Modern views of ancient metabolic networks. Curr. Opin. Syst. Biol. 8, 117–124 (2018).
Patel, B. H., Percivalle, C., Ritson, D. J., Duffy, C. D. & Sutherland, J. D. Common origins of RNA, protein and lipid precursors in a cyanosulfidic protometabolism. Nat. Chem. 7, 301–307 (2015).
Xu, J. et al. Selective prebiotic formation of RNA pyrimidine and DNA purine nucleosides. Nature 582, 60–66 (2020).
Yadav, M., Pulletikurti, S., Yerabolu, J. R. & Krishnamurthy, R. Cyanide as a primordial reductant enables a protometabolic reductive glyoxylate pathway. Nat. Chem. 14, 170–178 (2022).
Xavier, J. C., Hordijk, W., Kauffman, S., Steel, M. & Martin, W. F. Autocatalytic chemical networks at the origin of metabolism. Proc. Biol. Sci. 287, 20192377 (2020).
Xavier, J. C. & Kauffman, S. Small-molecule autocatalytic networks are universal metabolic fossils. Philos. Trans. R. Soc. A 380, 20210244 (2022).
Kauffman, S. A. The Origins of Order: Self-Organization and Selection in Evolution 709 (Oxford Univ. Press, 1993).
Kauffman, S. A. Autocatalytic sets of proteins. J. Theor. Biol. 119, 1–24 (1986).
Blokhuis, A., Lacoste, D. & Nghe, P. Universal motifs and the diversity of autocatalytic systems. Proc. Natl Acad. Sci. USA 117, 25230–25236 (2020).
Petchey, O. L. & Gaston, K. J. Extinction and the loss of functional diversity. Proc. Biol. Sci. 269, 1721–1727 (2002).
Marshall, C. R. Five palaeobiological laws needed to understand the evolution of the living biota. Nat. Ecol. Evol. 1, 0165 (2017).
Williams, T. A. et al. Integrative modeling of gene and genome evolution roots the archaeal tree of life. Proc. Natl Acad. Sci. USA 114, E4602–E4611 (2017).
Coleman, G. A. et al. A rooted phylogeny resolves early bacterial evolution. Science 372, 6542 (2021).
Raymond, J. & Segrè, D. The effect of oxygen on biochemical networks and the evolution of complex life. Science 311, 1764–1767 (2006).
Goldford, J. E., Hartman, H., Smith, T. F. & Segrè, D. Remnants of an ancient metabolism without phosphate. Cell 168, 1126–1134.e9 (2017).
Goldford, J. E., Hartman, H., Marsland, R. 3rd & Segrè, D. Environmental boundary conditions for the origin of life converge to an organo-sulfur metabolism. Nat. Ecol. Evol. 3, 1715–1724 (2019).
Tian, T. et al. Phosphates as energy sources to expand metabolic networks. Life 9, 43 (2019).
Lazcano, A. & Miller, S. L. On the origin of metabolic pathways. J. Mol. Evol. 49, 424–431 (1999).
Teichmann, S. A. et al. Small-molecule metabolism: an enzyme mosaic. Trends Biotechnol. 19, 482–486 (2001).
Teichmann, S. A. et al. The evolution and structural anatomy of the small molecule metabolic pathways in Escherichia coli. J. Mol. Biol. 311, 693–708 (2001).
Rison, S. C. G. & Thornton, J. M. Pathway evolution, structurally speaking. Curr. Opin. Struct. Biol. 12, 374–382 (2002).
Muchowska, K. B., Varma, S. J. & Moran, J. Nonenzymatic metabolic reactions and life’s origins. Chem. Rev. 120, 7708–7744 (2020).
Muto-Fujita, A. A novel model for the chemical evolution of metabolic networks. CICSJ Bulletin 37, 57–62 (2019).
Maeda, H. A. & Fernie, A. R. Evolutionary history of plant metabolism. Annu. Rev. Plant Biol. 72, 185–216 (2021).
Garcia, A. K., Cavanaugh, C. M. & Kacar, B. The curious consistency of carbon biosignatures over billions of years of Earth-life coevolution. ISME J. 15, 2183–2194 (2021).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Ebenhöh, O., Handorf, T. & Heinrich, R. Structural analysis of expanding metabolic networks. Genome Inform. 15, 35–45 (2004).
Handorf, T., Ebenhöh, O. & Heinrich, R. Expanding metabolic networks: scopes of compounds, robustness, and evolution. J. Mol. Evol. 61, 498–512 (2005).
Kim, H., Smith, H. B., Mathis, C., Raymond, J. & Walker, S. I. Universal scaling across biochemical networks on Earth. Sci. Adv. 5, eaau0149 (2019).
Martin, W. & Russell, M. J. On the origin of biochemistry at an alkaline hydrothermal vent. Philos. Trans. R. Soc. Lond. B 362, 1887–1926 (2007).
Camprubi, E., Jordan, S. F., Vasiliadou, R. & Lane, N. Iron catalysis at the origin of life. IUBMB Life 69, 373–381 (2017).
Muchowska, K. B., Varma, S. J. & Moran, J. Synthesis and breakdown of universal metabolic precursors promoted by iron. Nature 569, 104–107 (2019).
Varma, S. J., Muchowska, K. B., Chatelain, P. & Moran, J. Native iron reduces CO2 to intermediates and end-products of the acetyl-CoA pathway. Nat. Ecol. Evol. 2, 1019–1024 (2018).
Springsteen, G., Yerabolu, J. R., Nelson, J., Rhea, C. J. & Krishnamurthy, R. Linked cycles of oxidative decarboxylation of glyoxylate as protometabolic analogs of the citric acid cycle. Nat. Commun. 9, 91 (2018).
Muchowska, K. B. et al. Metals promote sequences of the reverse Krebs cycle. Nat. Ecol. Evol. 1, 1716–1721 (2017).
Hadadi, N., Hafner, J., Shajkofci, A., Zisaki, A. & Hatzimanikatis, V. ATLAS of Biochemistry: a repository of all possible biochemical reactions for synthetic biology and metabolic engineering studies. ACS Synth. Biol. 5, 1155–1166 (2016).
Hafner, J., MohammadiPeyhani, H., Sveshnikova, A., Scheidegger, A. & Hatzimanikatis, V. Updated ATLAS of Biochemistry with new metabolites and improved enzyme prediction power. ACS Synth. Biol. 9, 1479–1482 (2020).
Kappock, T. J., Ealick, S. E. & Stubbe, J. Modular evolution of the purine biosynthetic pathway. Curr. Opin. Chem. Biol. 4, 567–572 (2000).
Kun, A., Papp, B. & Szathmáry, E. Computational identification of obligatorily autocatalytic replicators embedded in metabolic networks. Genome Biol. 9, R51 (2008).
Whicher, A., Camprubi, E., Pinna, S., Herschy, B. & Lane, N. Acetyl phosphate as a primordial energy currency at the origin of life. Orig. Life Evol. Biosph. 48, 159–179 (2018).
Harrison, S. A. & Lane, N. Life as a guide to prebiotic nucleotide synthesis. Nat. Commun. 9, 5176 (2018).
Pinna, S. et al. A prebiotic basis for ATP as the universal energy currency. PLoS Biol. 20, e3001437 (2022).
Catling, D. C. & Zahnle, K. J. The Archean atmosphere. Sci. Adv. 6, eaax1420 (2020).
Fischer, W. W., Hemp, J. & Johnson, J. E. Evolution of oxygenic photosynthesis. Ann. Rev. Earth Planet. Sci. 44, 647–683 (2016).
Marshall, S. M. et al. Identifying molecules as biosignatures with assembly theory and mass spectrometry. Nat. Commun. 12, 3033 (2021).
Fontecilla-Camps, J. C. Geochemical continuity and catalyst/cofactor replacement in the emergence and evolution of life. Angew. Chem. Int. Ed. Engl. 58, 42–48 (2019).
Milner-White, E. J. & Russell, M. J. Functional capabilities of the earliest peptides and the emergence of life. Genes 2, 671–688 (2011).
Goldman, A. D. & Kacar, B. Cofactors are remnants of life’s origin and early evolution. J. Mol. Evol. https://doi.org/10.1007/s00239-020-09988-4 (2021).
Longo, L. M. et al. Primordial emergence of a nucleic acid-binding protein via phase separation and statistical ornithine-to-arginine conversion. Proc. Natl Acad. Sci. USA 117, 15731–15739 (2020).
Nitschke, W., McGlynn, S. E., Milner-White, E. J. & Russell, M. J. On the antiquity of metalloenzymes and their substrates in bioenergetics. Biochim. Biophys. Acta 1827, 871–881 (2013).
Saito, M. A., Sigman, D. M. & Morel, F. M. M. The bioinorganic chemistry of the ancient ocean: the co-evolution of cyanobacterial metal requirements and biogeochemical cycles at the Archean–Proterozoic boundary? Inorg. Chim. Acta 356, 308–318 (2003).
Johnson, J. E. & Molnar, P. H. Widespread and persistent deposition of iron formations for two billion years. Geophys. Res. Lett. 46, 3327–3339 (2019).
Ma, B.-G. et al. Characters of very ancient proteins. Biochem. Biophys. Res. Commun. 366, 607–611 (2008).
Goldman, A. D., Beatty, J. T. & Landweber, L. F. The TIM barrel architecture facilitated the early evolution of protein-mediated metabolism. J. Mol. Evol. 82, 17–26 (2016).
Nath, N., Mitchell, J. B. O. & Caetano-Anollés, G. The natural history of biocatalytic mechanisms. PLoS Comput. Biol. 10, e1003642 (2014).
Aziz, M. F., Caetano-Anollés, K. & Caetano-Anollés, G. The early history and emergence of molecular functions and modular scale-free network behavior. Sci. Rep. 6, 25058 (2016).
Eck, R. V. & Dayhoff, M. O. Evolution of the structure of ferredoxin based on living relics of primitive amino acid sequences. Science 152, 363–366 (1966).
Lee, J. & Blaber, M. Experimental support for the evolution of symmetric protein architecture from a simple peptide motif. Proc. Natl Acad. Sci. USA 108, 126–130 (2011).
Smock, R. G., Yadid, I., Dym, O., Clarke, J. & Tawfik, D. S. De novo evolutionary emergence of a symmetrical protein is shaped by folding constraints. Cell 164, 476–486 (2016).
Yagi, S. et al. Seven amino acid types suffice to create the core fold of RNA polymerase. J. Am. Chem. Soc. 143, 15998–16006 (2021).
Seal, M. et al. Peptide–RNA coacervates as a cradle for the evolution of folded domains. J. Am. Chem. Soc. 144, 14150–14160 (2022).
Wierenga, R. K. The TIM-barrel fold: a versatile framework for efficient enzymes. FEBS Lett. 492, 193–198 (2001).
Longo, L. M. et al. On the emergence of P-Loop NTPase and Rossmann enzymes from a beta–alpha–beta ancestral fragment. eLife 9, e64415 (2020).
Bliven, S. E. et al. Analyzing the symmetrical arrangement of structural repeats in proteins with CE-Symm. PLoS Comput. Biol. 15, e1006842 (2019).
Granick, S. in Evolving Genes and Proteins (eds Bryson, V. & Vogel, H. J.) 67–88 (Academic Press, 1965).
Horowitz, N. H. On the evolution of biochemical syntheses. Proc. Natl Acad. Sci. USA 31, 153–157 (1945).
Fuchs, G. Alternative pathways of carbon dioxide fixation: insights into the early evolution of life? Annu. Rev. Microbiol. 65, 631–658 (2011).
Braakman, R. & Smith, E. The emergence and early evolution of biological carbon-fixation. PLoS Comput. Biol. 8, e1002455 (2012).
Weiss, M. C. et al. The physiology and habitat of the last universal common ancestor. Nat. Microbiol. 1, 16116 (2016).
Garcia, P. S., Gribaldo, S. & Borrel, G. Diversity and evolution of methane-related pathways in Archaea. Annu. Rev. Microbiol. 76, 727–755 (2022).
Krissansen-Totton, J., Buick, R. & Catling, D. C. A statistical analysis of the carbon isotope record from the Archean to Phanerozoic and implications for the rise of oxygen. Am. J. Sci. 315, 275–316 (2015).
Hayes, J. M. Global methanotrophy at the Archean–Proterozoic transition. In Early Life on Earth. Nobel Symposium (ed. Bengston, S.) No. 84, 220–236 (Columbia Univ. Press, 1994).
Slotznick, S. P. & Fischer, W. W. Examining Archean methanotrophy. Earth Planet. Sci. Lett. 441, 52–59 (2016).
Prywes, N., Phillips, N. R., Tuck, O. T., Valentin-Alvarado, L. E. & Savage, D. F. Rubisco function, evolution, and engineering. Annu. Rev. Biochem. https://doi.org/10.1146/annurev-biochem-040320-101244 (2023).
Kędzior, M. et al. Resurrected Rubisco suggests uniform carbon isotope signatures over geologic time. Cell Rep. 39, 110726 (2022).
Wang, R. Z. et al. Carbon isotope fractionation by an ancestral rubisco suggests that biological proxies for CO2 through geologic time should be reevaluated. Proc. Natl Acad. Sci. USA 120, e2300466120 (2023).
Braakman, R. & Smith, E. The compositional and evolutionary logic of metabolism. Phys. Biol. 10, 011001 (2013).
Lane, N. Transformer: The Deep Chemistry of Life and Death (W. W. Norton & Company, 2022).
Shuman, R. F., Shearin, W. E. & Tull, R. J. Chemistry of hydrocyanic acid. 1. Formation and reactions of N-(aminomethylene)diaminomaleonitrile, a hydrocyanic acid pentamer and precursor to adenine. J. Org. Chem. 44, 4532–4536 (1979).
Roy, D., Najafian, K. & von Ragué Schleyer, P. Chemical evolution: the mechanism of the formation of adenine under prebiotic conditions. Proc. Natl Acad. Sci. USA 104, 17272–17277 (2007).
Sousa, F. L., Hordijk, W., Steel, M. & Martin, W. F. Autocatalytic sets in E. coli metabolism. J. Syst. Chem. 6, 4 (2015).
Nunes Palmeira, R., Colnaghi, M., Harrison, S. A., Pomiankowski, A. & Lane, N. The limits of metabolic heredity in protocells. Proc. Biol. Sci. 289, 20221469 (2022).
Harrison, S. A., Palmeira, R. N., Halpern, A. & Lane, N. A biophysical basis for the emergence of the genetic code in protocells. Biochim. Biophys. Acta, Bioenerg. 1863, 148597 (2022).
Noor, E., Haraldsdóttir, H. S., Milo, R. & Fleming, R. M. T. Consistent estimation of Gibbs energy using component contributions. PLoS Comput. Biol. 9, e1003098 (2013).
Beber, M. E. et al. eQuilibrator 3.0: a database solution for thermodynamic constant estimation. Nucleic Acids Res. https://doi.org/10.1093/nar/gkab1106 (2021).
Bertz, S. H. The first general index of molecular complexity. J. Am. Chem. Soc. 103, 3599–3601 (1981).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Cheng, H. et al. ECOD: an evolutionary classification of protein domains. PLoS Comput. Biol. 10, e1003926 (2014).
Cheng, H., Liao, Y., Schaeffer, R. D. & Grishin, N. V. Manual classification strategies in the ECOD database. Proteins 83, 1238–1251 (2015).
Mistry, J., Finn, R. D., Eddy, S. R., Bateman, A. & Punta, M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 41, e121 (2013).
Acknowledgements
We thank W. Fisher, A. Flamholz and J. Valentine for valuable discussions. J.E.G. is supported by the Gordon and Betty Moore Foundation as a Physics of Living Systems Fellow through grant number GBMF4513 and the Simons Foundation. J.E.G. and B.A.W. are supported by NASA’s Interdisciplinary Consortia for Astrobiology Research (grant number 80NSSC23K1357). S.E.M. and B.A.W. acknowledge support from NSF Awards 1724300 and 1724393 (Collaborative Research: Biochemical, Genetic, Metabolic, and Isotopic Constraints on an Ancient Thiobiosphere). In addition, S.E.M. acknowledges support from JSPS KAKENHI (grant numbers JP18H01325 and 22H01343). H.B.S. acknowledges support from JSPS KAKENHI (grant number JP19K23459).
Contributions
J.E.G., H.B.S., L.M.L. and S.E.M. designed the research. J.E.G., H.B.S. and L.M.L. prepared data. J.E.G. and H.B.S. wrote code. J.E.G. ran simulations and performed analyses. B.A.W. performed geological analyses. J.E.G., H.B.S., L.M.L., S.E.M. and B.A.W. wrote the paper. All authors read and approved the final paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Ecology & Evolution thanks Nick Lane, Eric Smith and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Network size and trajectory of metabolic expansion with hypothetical biochemistry.
We performed network expansion using the same seed set as shown in Fig. 1 in the main text, but added 20,183 additional reactions from the ATLAS database (refs. 39,40). (a) The network sizes (number of compounds produced) for the expansion using KEGG only (blue) or KEGG plus the ATLAS database (purple) with either (i) no primitive coupling model, (ii) no primitive coupling, but with adenine added to the seed set, or (iii) added primitive phosphate coupling. (b) The number of compounds produced at each iteration (y-axis) for the network composed of both KEGG and ATLAS reactions with primitive coupling (purple line) shows a punctuated structure similar to that of the KEGG reaction network with primitive coupling (blue line; see also Fig. 2b).
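The network expansion procedure referred to in this caption can be illustrated with a minimal sketch. The toy compounds and reactions below (`r1` to `r4`, `A` to `G`) are hypothetical, reactions are treated as irreversible for simplicity, and the function is an illustration of the general technique rather than the authors' code (which is available in the repository listed under Code availability).

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A reaction is a pair (substrates, products); names are hypothetical.
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def expand(seed: Set[str], reactions: Dict[str, Reaction]) -> Tuple[Set[str], List[int]]:
    """Iteratively add products of reactions whose substrates are all available.

    Returns the final scope and the number of new compounds per iteration.
    """
    scope = set(seed)
    remaining = dict(reactions)
    added_per_iteration: List[int] = []
    while True:
        # Reactions become feasible once every substrate is in the scope.
        feasible = [name for name, (subs, _) in remaining.items() if subs <= scope]
        new_compounds: Set[str] = set()
        for name in feasible:
            _, products = remaining.pop(name)
            new_compounds |= products - scope
        if not new_compounds:
            break  # nothing new was produced, so the expansion has converged
        scope |= new_compounds
        added_per_iteration.append(len(new_compounds))
    return scope, added_per_iteration


toy_reactions: Dict[str, Reaction] = {
    "r1": (frozenset({"A", "B"}), frozenset({"C"})),
    "r2": (frozenset({"C"}), frozenset({"D", "E"})),
    "r3": (frozenset({"D", "F"}), frozenset({"G"})),  # blocked: F is never made
    "r4": (frozenset({"E"}), frozenset({"A"})),       # adds nothing new
}
scope, trajectory = expand({"A", "B"}, toy_reactions)
print(sorted(scope), trajectory)
```

In this toy case the scope stops growing after two iterations because `F` is never produced; the per-iteration counts play the role of the trajectory plotted in panel (b).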
Extended Data Fig. 2 Scope sizes with alternative models of phosphoryl donor promiscuity.
(a) We constructed hypothetical de novo purine biosynthesis pathways with one of 6 potential phosphoryl donors in KEGG (PPi = pyrophosphate, Acetyl-P = acetyl phosphate, PEP = phosphoenolpyruvate, Glucose-6P = glucose-6-phosphate, Glycerol-3P = glycerol-3-phosphate, UTP = uridine triphosphate, CTP = cytidine triphosphate, and Creatine-P = creatine phosphate). The ensuing network sizes are plotted on the y-axis. Note that pyrimidine triphosphates did not enable expansion above 1000 metabolites. (b) A comparison between including primitive pyrophosphate-coupled purine synthesis versus substituting all ATP/GTP-coupled reactions with pyrophosphate alternatives (n = 523). A scatterplot for all compounds emerging in both expansions, where the x-axis denotes the iteration at which a compound appears in the primitive purine biosynthesis model, and the y-axis denotes the iteration at which the same compound emerges in the model in which all ATP/GTP-coupled reactions have pyrophosphate alternatives. The positive correlation coefficient (Pearson’s r > 0.99, P < \(2.2 \times 10^{-16}\)) denotes a strong similarity in expansion trajectories for both models.
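The comparison in panel (b) amounts to correlating, over the compounds shared by two expansions, the iteration at which each compound first appears. A minimal sketch follows; the compound names and iteration numbers are made up for illustration and the helper names are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical first-appearance iterations in two expansion models.
model_a: Dict[str, int] = {"cpd_a": 1, "cpd_b": 2, "cpd_c": 3, "cpd_d": 5}
model_b: Dict[str, int] = {"cpd_a": 1, "cpd_b": 2, "cpd_c": 4, "cpd_d": 6, "cpd_e": 7}

# Only compounds that emerge in both expansions are compared.
shared = sorted(set(model_a) & set(model_b))
r = pearson_r([model_a[c] for c in shared], [model_b[c] for c in shared])
print(shared, round(r, 3))
```

A coefficient near 1 indicates that compounds emerge in nearly the same order in both models, which is the sense in which the caption describes the trajectories as similar.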
Extended Data Fig. 3 Expansion with alternative nitrogen, sulfur, and carbon sources at various redox states changes scope and trajectory of metabolic evolution.
(a) We altered the nitrogen (x-axis) and sulfur sources (colour) of the seed set (listed in Table S4) and re-ran the expansion. The size of the network (that is, the number of compounds in the scope) is plotted on the y-axis. Only expansions with fully reduced nitrogen led to large networks, while several sulfur sources (sulfate, sulfite, thiosulfate, and sulfide) could lead to large networks. (b) We varied the composition of the organic molecules by randomly sampling sets of N = 21 compounds from a distribution ranked by the computed degree of reduction (DOR). As described before (refs. 18,19), the DOR for a molecule C is y/x, where x and y are the stoichiometric coefficients in the balanced reaction xCO2 + yH2 = C + zH2O. We either sampled compounds from the bottom quartile of the distribution (lower, green dots) or from the top quartile (higher, orange dots). We found that expansion to >4000 compounds required seed compounds with a low DOR, that is, more oxidized carbon sources. (c) The average DOR for molecules produced at each iteration (y-axis) was plotted against the expansion iteration (x-axis); these variables exhibit a significant correlation (Pearson’s r = 0.89, P = \(1.74 \times 10^{-27}\)). In subplot (c), the red line (shaded region) represents the line of best fit (95% confidence interval) (see Methods).
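As a worked illustration of the DOR definition in panel (b), the sketch below balances xCO2 + yH2 = C + zH2O for a neutral compound containing only carbon, hydrogen and oxygen. Restricting to C/H/O compounds is a simplifying assumption made here, and the function name is hypothetical.

```python
def degree_of_reduction(n_carbon: int, n_hydrogen: int, n_oxygen: int) -> float:
    """DOR = y/x for x CO2 + y H2 = C_x H_h O_o + z H2O (C/H/O compounds only).

    Carbon balance: x = n_carbon
    Oxygen balance: 2x = n_oxygen + z      -> z = 2x - n_oxygen
    Hydrogen balance: 2y = n_hydrogen + 2z -> y = n_hydrogen / 2 + z
    """
    if n_carbon <= 0:
        raise ValueError("DOR is defined per carbon; need at least one carbon atom")
    z = 2 * n_carbon - n_oxygen
    y = n_hydrogen / 2 + z
    return y / n_carbon


# Acetic acid, C2H4O2: 2 CO2 + 4 H2 = C2H4O2 + 2 H2O
print(degree_of_reduction(2, 4, 2))  # 2.0
# Methane, CH4: CO2 + 4 H2 = CH4 + 2 H2O
print(degree_of_reduction(1, 4, 0))  # 4.0
# Carbon dioxide itself requires no H2
print(degree_of_reduction(1, 0, 2))  # 0.0
```

Lower values correspond to more oxidized carbon, consistent with the caption's statement that low-DOR seed compounds are the more oxidized carbon sources.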
Extended Data Fig. 4 Molecular complexity and metabolic network expansion.
(a) The average Wiener topological index (y-axis) at each expansion iteration (x-axis). (b) Same as (a) but for the Randić topological index. In a-b, red lines (shaded regions) represent regression lines (95% confidence intervals) (see Methods). We computed Pearson correlation coefficients between expansion iteration and the Wiener index (r = 0.74, P = \(1.18 \times 10^{-18}\)) and Randić index (r = 0.79, P = \(1.60 \times 10^{-22}\)). (c) We sampled 21 organic molecules either at the top (blue dots, “complex”) or bottom (red dots, “simple”) quartile of molecules in terms of Bertz molecular complexity and re-ran network expansion. Only expansions with low-complexity seed sets (red dots) led to scopes of >4000 compounds. (d) For every compound for which we could compute a Bertz complexity metric (n = 3543), we added that compound to the original seed set and re-ran network expansion. We next quantified the number of peaks in the expansion trajectory using the find_peaks function in the scipy.signal module with parameters height=0, distance=5, and prominence=30. We found that the number of peaks was inversely correlated with the Bertz complexity of the additional seed molecule (Pearson’s r = -0.35 with P = \(1.36 \times 10^{-104}\)). Violin plot details for subplot d are as follows. Number of peaks: 4 (n = 123): Min=17.61, Max=3452.12, Median=1900.32, Q1 = 1779.66, Q3 = 2000.49; Number of peaks: 5 (n = 191): Min=0.00, Max=2676.11, Median=788.27, Q1 = 393.99, Q3 = 1624.70; Number of peaks: 6 (n = 3255): Min=0.00, Max=7606.85, Median=349.22, Q1 = 184.66, Q3 = 763.19; Number of peaks: 7 (n = 9): Min=392.11, Max=2534.64, Median=568.37, Q1 = 396.01, Q3 = 572.37.
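The peak counting in panel (d) can be sketched as follows, using the parameter values given in the caption. The trajectory array is synthetic and chosen only to show how the prominence threshold discards small bumps; it is not data from the study.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic "compounds added per iteration" trajectory (illustrative only).
trajectory = np.array(
    [0, 0, 50, 0, 0, 0, 0, 0, 0, 100, 0, 0, 0, 0, 0, 0, 10, 0, 0]
)

# Parameter values as stated in the caption: peaks must have non-negative
# height, be at least 5 iterations apart, and have prominence of at least 30.
peaks, properties = find_peaks(trajectory, height=0, distance=5, prominence=30)
n_peaks = len(peaks)

print(peaks.tolist(), n_peaks)  # the bump of height 10 is below the prominence cut
```

Here the bumps at iterations 2 and 9 are counted, while the bump of height 10 at iteration 16 is excluded because its prominence is below 30.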
Extended Data Fig. 5 Microbial genomes encode pathways that use compounds in expansion scope at higher rates than compounds outside of expansion scope.
We took 1,321 KEGG compounds that are intermediates in 377 KEGG modules, computed the proportion of prokaryotic genomes that use each compound as an intermediate in at least one KEGG module (y-axis), and grouped each compound based on whether the compound was observed in the expansion scope or not (x-axis). We plotted the distributions as violin plots and found that compounds observed in the expansion scope (n = 1,024) were much more prevalent in microbial genomes than compounds not observed in the expansion scope (n = 288) (Mann-Whitney U test: P = \(7.6 \times 10^{-64}\)). Violin plot details: Compounds not observed in expansion (n = 295): Min=0.00, Max=0.94, Median=0.03, Q1 = 0.00, Q3 = 0.13; Compounds observed in expansion (n = 1026): Min=0.00, Max=1.00, Median=0.66, Q1 = 0.09, Q3 = 0.95.
Extended Data Fig. 6 Utilization and dependence of eukaryote-specific reactions during expansion.
(a) We identified 1638 reactions that were uniquely mapped to eukaryotic genomes, and plotted the proportion of reactions added during each expansion iteration that were eukaryote-specific (y-axis). We find a weak positive correlation between the relative proportion of eukaryote-specific reactions and the expansion iteration (Pearson’s r = 0.32, P = \(1.1 \times 10^{-3}\)). (b) We removed these reactions, performed the expansion, and found that the expansion scope consisted of 3806 compounds (88% of the reported expansion scope in the main text), with similar punctate structure. We found that removal of these 1638 reactions led to a particularly large network compared to randomly removed reaction sets: we sampled \(10^{3}\) sets of reactions (all of size 1638) to remove and re-ran network expansion. For all randomly chosen sets, the resulting networks were smaller than the eukaryote-depleted network (inset; the orange line denotes the scope size of the network after removing eukaryote-specific reactions).
Extended Data Fig. 7 Sensitivity of carbon fixation pathway timing to seed set variations.
We randomly sampled 10 additional molecules from the set of KEGG compounds with fewer than 6 carbons (m = 1805), repeated the expansion, computed the iterations when each pathway became feasible (see Fig. 4c,d in the main text), and repeated the process \(10^{4}\) times. (a) A pairwise scatterplot between the iteration of completion for each carbon fixation pathway, where the dotted line is the line of unity. The red box highlights the comparison between the Calvin cycle and the reductive citric acid (rTCA) cycle, showing that a subset of conditions led to the Calvin cycle and rTCA cycle emerging at the same iteration. In contrast, the purple box highlights the comparison between the Calvin cycle and the reductive acetyl-CoA pathway, showing that for all expansions, the Calvin cycle precedes the extant reductive acetyl-CoA pathway during the expansion. (b) A heatmap showing the pathway completion for an expansion with the original seed set, plus valine, which results in both the Calvin cycle and rTCA cycle emerging at the same iteration due to the accelerated production of Coenzyme A (CoA). (c) A volcano plot for each molecule randomly added to the seed set. For each metabolite, we computed the enrichment of that metabolite in the seed set for expansions that led to identical completion iterations between the Calvin and rTCA cycles (x-axis), and computed the statistical significance of this enrichment (y-axis) using Fisher’s exact test. Metabolites that are over-represented in expansions where the Calvin cycle emerged at the same iteration as the rTCA cycle fall within the top right corner of the plot. Several of these metabolites are involved in either branched-chain amino acid or pantothenate biosynthesis, both of which supply key precursors for Coenzyme A.
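The enrichment test in panel (c) can be illustrated with a small, self-contained sketch of a one-sided Fisher's exact test on a 2×2 table. The caption does not state whether the test was one- or two-sided, so the one-sided form is an assumption here, and the counts are invented for illustration.

```python
from math import comb


def fisher_enrichment_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test P(X >= a) for the 2x2 table [[a, b], [c, d]].

    Rows: metabolite in seed set (yes / no).
    Columns: Calvin and rTCA cycles complete at the same iteration (yes / no).
    """
    in_seed = a + b            # expansions whose seed set contains the metabolite
    coincident = a + c         # expansions with identical completion iterations
    total = a + b + c + d
    upper = min(in_seed, coincident)
    # Hypergeometric tail: tables at least as extreme as the observed one.
    tail = sum(
        comb(coincident, k) * comb(total - coincident, in_seed - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, in_seed)


# Hypothetical counts: 10 expansions, 4 coincident, metabolite seeded in 3,
# and all 3 of those are coincident.
p_value = fisher_enrichment_p(a=3, b=0, c=1, d=6)
print(round(p_value, 4))  # 4 / 120
```

In a volcano plot, the y-axis would typically show a transform such as the negative logarithm of this P value, so strongly enriched metabolites rise toward the top right.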
Supplementary information
Supplementary Information
Supplementary Methods and Discussion.
Supplementary Tables
Supplementary Tables 1–15.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Goldford, J.E., Smith, H.B., Longo, L.M. et al. Primitive purine biosynthesis connects ancient geochemistry to modern metabolism. Nat Ecol Evol 8, 999–1009 (2024). https://doi.org/10.1038/s41559-024-02361-4
DOI: https://doi.org/10.1038/s41559-024-02361-4
- Springer Nature Limited