Introduction

On the early Earth, in the absence of pre-existing templates, how might prebiotic polymers have emerged de novo? The template-free synthesis of polypeptides from amino acids dissolved in water must overcome the tendency for water to cleave rather than form peptide bonds. Many strategies, reviewed elsewhere (Kitadai and Maruyama 2017; Rode et al. 2007), employ dehydrating or activating agents including, for example, mineral surfaces (Lahav et al. 1978; Lambert 2008), dissolved Cu(II) and NaCl (Rode et al. 2007; Schwendinger and Rode 1989), imidazole (Ferris et al. 1996; Sawai et al. 1975), and polyphosphates (Fernandez-Garcia et al. 2017; Rabinowitz et al. 1969; Ying et al. 2018). Further conditions define a growing repertoire of mechanisms for de novo polypeptide formation (Forsythe et al. 2015; Gallego et al. 2015; Gibard et al. 2018; Rodriguez-Garcia et al. 2015; Sibilska et al. 2017), where characterization of products employs both HPLC and peptide sequencing by tandem mass spectrometry (Forsythe et al. 2017; Forsythe et al. 2015; Rodriguez-Garcia et al. 2015; Steen and Mann 2004). Commonly, de novo peptide synthesis employs aqueous solvent conditions, often combined with high temperatures (Huber and Wächtershäuser 1998; Imai et al. 1999; Rodriguez-Garcia et al. 2015), extremes of pH (Zamaraev et al. 1997; Sakata et al. 2010; Rodriguez-Garcia et al. 2015) and evaporation to form dry reactive residues (Lohrmann et al. 1975; Basiuk et al. 1990; Napier and Yin 2006). However, such studies are often carried out over narrow ranges of temperature and pH, with or without activating agents, providing an incomplete view of the potential prevalence of such condensation reactions in nature. Here, we address this limitation by surveying the de novo emergence of peptides formed by the condensation of glycine and alanine from aqueous solutions across 132 conditions, spanning twelve levels of acidity (pH 1-to-12) and eleven temperatures of water (0-to-100 °C).
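The survey grid described above can be enumerated in a few lines. The following is a minimal sketch, not the authors' code; the even 10 °C spacing between temperatures is an assumption inferred from the stated range and count (eleven temperatures from 0 to 100 °C), and the variable names are illustrative.

```python
from itertools import product

# Twelve pH levels (pH 1 to 12), as stated in the text.
PH_LEVELS = list(range(1, 13))

# Eleven temperatures from 0 to 100 °C; the even 10 °C spacing is an assumption.
TEMPERATURES_C = list(range(0, 101, 10))

# Every (pH, temperature) combination in the survey.
conditions = list(product(PH_LEVELS, TEMPERATURES_C))

print(len(PH_LEVELS), len(TEMPERATURES_C), len(conditions))
```

Each condition is tested both with and without TP, so the same grid is traversed twice.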

All life on Earth relies on protein synthesis, a complex and highly evolved biochemical process that depends on the activation of amino acids by adenosine triphosphate (ATP) (Lane 2015), a versatile energy-rich molecule. ATP has plausible prebiotic chemical origins (Patel et al. 2015), and it can activate template-free condensation reactions of amino acids to form peptides (Rishpon et al. 1982; Weber et al. 1977), but ATP is larger and more chemically complex than amino acids. Simpler inorganic polyphosphates, such as trimetaphosphate (TP), can likewise promote condensation reactions of amino acids (Gao et al. 2011; Hulshof and Ponnamperuma 1976; Rabinowitz et al. 1969; Sibilska et al. 2017; Ying et al. 2018). Moreover, TP can be produced by volcanic processes (Yamagata et al. 1991) or by other prebiotic mechanisms (Osterberg and Orgel 1972; Pasek 2008). As a plausible inorganic precursor to ATP, we included TP in our systematic survey of de novo amino acid condensation reactions across temperature and pH. An overview of experiments is provided in Fig. 1.

Fig. 1

Overview of experiments. (a) Structures of starting amino acids (glycine, L-alanine) and trimetaphosphate (TP). (b) Structures of reaction products. Amino acids and peptides are shown at their isoelectric points. (c) Summary of the experimental protocol. Glycine and alanine were dissolved in water, the solution pH was set, and mixtures were incubated, open to the atmosphere, at different temperatures for 24 h, which in most cases promoted evaporation and left a reactive dry solid phase. Samples were then re-dissolved in water and analyzed by high performance liquid chromatography (HPLC) as well as mass spectrometry (MS). (d) HPLC chromatograms of an exemplary reaction mixture, starting amino acids and identified peptide standards. (e) MS/MS spectra of species having the same composition (G2A) but different sequences exhibit different fragmentation patterns with distinct sequence-specific product ions, which enable characterization of peptide sequences

Materials and Methods

Materials: All chemicals were of analytical grade purity and used without further purification. Glycine, sodium hydroxide, potassium phosphate monobasic and hydrochloric acid (12 N) were purchased from Thermo Fisher Scientific (Waltham, MA, USA). Hexanesulfonic acid sodium salt, trisodium trimetaphosphate, phosphoric acid, alanine, diglycine, alanylglycine, glycylalanine, dialanine and triglycine were supplied by Sigma Aldrich Co. (St. Louis, MO, USA). Tetra-, penta- and hexaglycine were supplied by Bachem (Torrance, CA, USA). The tripeptides AAA, AAG, AGA, GAA, GGA, GAG and AGG were purchased from Biomatik (Wilmington, DE, USA).

Experiment setup: Reactions were carried out in 1.5 mL low-retention Eppendorf tubes. A standard modular heater with temperature control was used as a heat source (VWR, Radnor, PA, USA). Stock solutions were: glycine (1 M aq. sol.), alanine (1 M aq. sol.), TP (0.5 M aq. sol.), NaOH (1 M aq. sol., 0.1 M aq. sol.), HCl (12 M aq. sol., 3 M aq. sol., 0.3 M aq. sol.). All stock solutions were prepared fresh every 3–4 days using ddH2O and stored at 4 °C in tightly sealed tubes.

General procedure for drying-induced condensation of alanine and glycine in the absence of trimetaphosphate: A reaction mixture of alanine (2.225 mg, 25 μmol, 25 μL, 1 M aq. sol.) and glycine (1.875 mg, 25 μmol, 25 μL, 1 M aq. sol.) was either alkalized with NaOH aq. sol. or acidified with HCl aq. sol. (when needed, ddH2O was added to reach a final volume of 200 μL) and kept under evaporating conditions (uncapped) in low-retention Eppendorf tubes at temperatures ranging from 0 to 100 °C for 24 h. Following incubation, the dried pellet was dissolved in 1 mL of ddH2O and subjected to HPLC analysis. For lower temperature experiments, where bulk water was still present after 24 h, the remaining solution was supplemented with ddH2O to a final volume of 1 mL.

General procedure for drying-induced condensation of alanine and glycine in the presence of trimetaphosphate: A reaction mixture of alanine (2.22 mg, 25 μmol, 25 μL, 1 M aq. sol.), glycine (1.87 mg, 25 μmol, 25 μL, 1 M aq. sol.) and trisodium trimetaphosphate (4.6 mg, 15 μmol, 30 μL, 0.5 M aq. sol.) was either alkalized with NaOH aq. sol. or acidified with HCl aq. sol. (when needed, ddH2O was added to reach a final volume of 200 μL) and kept under evaporating conditions (uncapped) in low-retention Eppendorf tubes at temperatures ranging from 0 to 100 °C for 24 h. Following incubation, the dried pellet was dissolved in 1 mL of ddH2O and subjected to HPLC analysis. For lower temperature experiments, where bulk water was still present after 24 h, the remaining solution was supplemented with ddH2O to a final volume of 1 mL.
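The initial concentrations implied by the two procedures above follow from simple dilution arithmetic. The sketch below is illustrative only; the function names are ours, and the only inputs are the stock concentrations, aliquot volumes and 200 μL final volume given in the text.

```python
def amount_umol(stock_molar, volume_ul):
    """Amount of solute (in micromoles) in an aliquot of stock solution."""
    # mol/L times microliters gives micromoles directly.
    return stock_molar * volume_ul

def final_conc_mm(stock_molar, volume_ul, final_volume_ul=200.0):
    """Concentration (mM) after diluting the aliquot to the final reaction volume."""
    return amount_umol(stock_molar, volume_ul) / final_volume_ul * 1000.0

glycine_mm = final_conc_mm(1.0, 25.0)   # 25 uL of 1 M glycine stock
alanine_mm = final_conc_mm(1.0, 25.0)   # 25 uL of 1 M alanine stock
tp_mm = final_conc_mm(0.5, 30.0)        # 30 uL of 0.5 M TP stock

total_amino_acid_mm = glycine_mm + alanine_mm

print(glycine_mm, alanine_mm, tp_mm, total_amino_acid_mm)
```

Under these inputs each amino acid starts at 125 mM (250 mM combined) and TP at 75 mM, before any evaporation takes place.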

Reproducibility experiment setup: A second batch of experiments was performed independently of the first, with new stock solutions and a different heating block. For this experiment, a single temperature and pH were picked at random (60 °C, pH 9). Samples from the different batches were analyzed by the same method and compared.

IP-HPLC analysis and product identification: Samples from experiments were analyzed using a Shimadzu Nexera XR IP-HPLC system fitted with a reversed-phase C18 column (Phenomenex Aeris XB-C18, 150 mm × 4.6 mm, 3.6 μm; Phenomenex, Torrance, CA, USA). Samples were auto-injected in 10 μL aliquots (Shimadzu Nexera X2 Autosampler; Shimadzu, Nakagyo-ku, Kyoto, Japan), and analyzed in isocratic mode with a flow rate of 1 mL/min. The mobile phase consisted of 50 mM KH2PO4 and 7.5 mM C6H13SO3Na solution adjusted to pH 2.1 with H3PO4. The instrument was controlled and the resulting data analyzed using Lab Solutions Software. Oligomeric products were detected at 195 nm, and their HPLC retention times were confirmed by comparison with pre-made or commercially available standards containing both monomers (alanine and glycine) and oligomers.

Mass spectrometry: High-resolution mass spectrometry data were collected on a MALDI-LTQ-Orbitrap XL (MALDI, matrix-assisted laser desorption/ionization) mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) in positive ion mode. One microliter of 2,5-dihydroxybenzoic acid (DHB) matrix solution (150 mg/mL in MeOH:H2O:FA at a volume ratio of 50:50:0.1) was mixed with 1 μL of analyte, then spotted on a MALDI plate. Full MS was acquired at m/z 100–1000 with a mass resolution of 30,000 (at m/z 400). MS/MS analyses were performed using higher-energy C-trap dissociation (HCD). A normalized collision energy of 35 was used for each fragmentation.
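To illustrate why tandem MS can distinguish sequence isomers of the same composition (such as the three G2A tripeptides AGG, GAG and GGA), the sketch below computes idealized singly protonated b- and y-ion m/z values. This is a simplified textbook model rather than the authors' analysis: the monoisotopic masses are standard values that are not taken from the text, and real MALDI spectra may also contain other adducts and fragment types.

```python
# Standard monoisotopic masses in Da (assumed values, not from the source text).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711}
WATER = 18.01056
PROTON = 1.00728

def precursor_mz(seq):
    """m/z of the singly protonated intact peptide, [M+H]+."""
    return sum(RESIDUE_MASS[r] for r in seq) + WATER + PROTON

def b_ions(seq):
    """N-terminal fragments: residue masses plus one proton."""
    return [round(sum(RESIDUE_MASS[r] for r in seq[:i]) + PROTON, 4)
            for i in range(1, len(seq))]

def y_ions(seq):
    """C-terminal fragments: residue masses plus water plus one proton."""
    return [round(sum(RESIDUE_MASS[r] for r in seq[-i:]) + WATER + PROTON, 4)
            for i in range(1, len(seq))]

def fragment_pattern(seq):
    """Combined b/y ion list used to compare sequence isomers."""
    return (tuple(b_ions(seq)), tuple(y_ions(seq)))

for seq in ("AGG", "GAG", "GGA"):
    print(seq, round(precursor_mz(seq), 4), b_ions(seq), y_ions(seq))
```

In this model the three isomers share one precursor m/z but each has a different combination of b and y ions, which is the property exploited for sequence assignment.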

Results

As one might expect, shorter peptides dominated the distribution of products. In the absence of TP, the most prevalent reaction product was diglycine (GG), as shown in Fig. 2a, detectable at or above 40 °C, conditions where bulk water fully evaporated within 24 h. Concentrations of GG exhibited two ‘hot spots’ of higher yields occurring at pH 2–3 and pH 9–10 above 70 °C (Fig. 2b). The enhancement of polymerization under acidic conditions results from ionic interactions of substrates and evaporation processes, while higher peptide bond formation under alkaline conditions can be attributed to increased amine nucleophilicity (Sibilska et al. 2017). Under conditions of neutral pH neither acid- nor alkaline-enhanced mechanisms are available, so condensation of amino acids proceeds at a very low rate. For near neutral pH 5–6, the absence of side groups and steric interactions enabled GG formation (Fig. 2a), but not the other dipeptides, AA, AG or GA.

Fig. 2

De novo synthesized oligopeptides of glycine and alanine depend on pH, temperature and TP activator. (a) Detected distribution of each species from 0-to-100 °C and pH 1-to-12 in the absence (black square) and presence (blue fill) of TP activator. Concentrations of each species are shown in the (b) absence or (c) presence of activator, where different shades of blue reflect solution-phase concentration on a log scale with a maximum value of log = 3, or 10^3 μM. Initial solution concentrations of amino acids were 250 mM, and activator was 75 mM. Further analysis indicated reproducibility of product levels within 11% among samples tested in triplicate (Supplementary Materials, Fig. S1)

In the presence of TP, however, the prevalence of GG was significantly extended to include the full range of tested pH (1–12) when temperatures were at least 50 °C, and GG was detected at all temperatures (0–100 °C) for pH 8 and higher (Fig. 2a). This result is noteworthy because in the range of temperatures 0–30 °C, bulk water remained in each sample, indicating conditions where activation by TP could promote the condensation reaction. Moreover, in the presence of TP, the more alkaline hot spot appeared to shift from pH 9–10 (Fig. 2b) to pH 11–12 (Fig. 2c). In earlier work, diglycine was also the most prevalent dipeptide produced from diverse amino acid monomers during salt-induced peptide formation in the presence of Cu(II) and NaCl (Rode 1999). Like GG, alanylglycine (AG) also exhibited hot spots of productivity at pH 2–3 and pH 9–10 (Fig. 2c), and in the presence of TP it was produced over a broad range of temperatures (10–100 °C) for pH 8 and across all pH (1–12) for temperatures of at least 50 °C (Fig. 2c). Moreover, in the presence of TP the more alkaline hot spot for synthesis of AG also appeared to shift from pH 9–10 (Fig. 2b) to pH 11–12 (Fig. 2c). Although the levels of remaining dipeptides, glycylalanine (GA) and dialanine (AA), were not resolved by HPLC from each other, their combined distribution was sufficient to deduce that GA was less prevalent than AG, and AA was less prevalent than GG. Observed overall lower rates of AA formation, relative to GG formation, are consistent with results from density functional theory (Bhunia et al. 2016), suggesting that electronic structure calculations may be relevant to some fundamental aspects of the dried solid-phase reactions. Like AG, both GA and AA exhibited similar patterns of productivity in their dependence on temperature and pH. The final observed dipeptide was diketopiperazine (DKP), which may be formed by intramolecular condensation and ring-closing of diglycine. Like GG and the other dipeptides, DKP exhibited hot spots of productivity at pH 2–3 and pH 9–10 and higher temperatures, but overall yields were generally lower. Further, in the presence of TP, condensation reactions to form DKP could proceed in the presence of bulk water, down to 10 °C (Fig. 2a). However, the higher pH hot spot for higher yields observed for the other dipeptides was diffuse or non-existent (Fig. 2c). Both GG and DKP have been detected in stony meteorites (Shimoyama and Ogasawara 2002), providing a plausible link between these laboratory observations and reaction products in extraterrestrial environments.

For tripeptides of glycine and alanine we observed AGG and GGA, as shown in Fig. 2a, and previous work indicates GGG is likely present (Sibilska et al. 2017), but it was not resolved here due to the presence of alanine (A), a concentrated species with a similar HPLC retention time. Other tripeptides, including GAG, or those enriched in A, specifically GAA, AGA, AAG or AAA, were not detected. In the absence of TP, patterns for tripeptides were similar to those of dipeptides but less widespread, only detectable for temperatures above 70 °C. In the presence of TP, the higher pH hot spot was diffuse or non-existent, like that of DKP. At and below 30 °C, where bulk water was present, dipeptides formed in the presence of TP, but no tripeptides were detected. The lack of tripeptides is not surprising because one would need to successfully couple two bimolecular reactions, the first to form a dipeptide from two monomers, and the second to form a tripeptide from the dipeptide and monomer. In solution, amino acids activated by TP created all possible dipeptide species. But the resulting dipeptide products were not sufficiently concentrated to drive a further bimolecular reaction with available monomers or activated monomers to create detectable tripeptides. Above 30 °C, however, where evaporation depleted the bulk water, dipeptides and monomers were sufficiently concentrated at high pH to drive formation of the tripeptide AGG, and higher temperatures further enabled the formation of GGA.

Among possible tetrapeptide and pentapeptide products, only the homopolymers of glycine were detectable above 70 °C and pH 9 in the absence of TP, as shown in Fig. 2a. In the presence of TP, the hot spot for higher yields shifted from pH 9 to pH 3. In an alkaline environment, activation of amino acids by TP occurs by a two-step mechanism (Rabinowitz 1970). First, nucleophilic attack of a free (deprotonated) amine moiety at the phosphorus results in formation of an N-triphosphoramidate with subsequent formation of a five-membered cyclic mixed anhydride. Ring closing is possible by the attack of the carboxylate on the phosphorus, which is coupled to displacement of pyrophosphate. The second step involves reaction of another monomer amine moiety with the activated carbon of the mixed anhydride, followed by the hydrolysis of the phosphoramidate, yielding a dipeptide. In contrast with the amino acid monomers, formation of a cyclic intermediate is not possible for longer oligopeptides. In such cases, activation of the C-terminus proceeds intermolecularly, where phosphate is transferred from another N-phosphoro-dipeptide, and upon formation of an active acyl compound, polymerization can progress further. At and below pH 7, when the nucleophilicity of the amine drops drastically due to its protonation, and the carboxylate remains deprotonated, attack of a phosphate proceeds directly on the carbonyl. As we move from higher to lower pH, peptide bond formation is driven more by the electrophilicity of phosphate than the nucleophilicity of the amine, and for pH 2–3 the process fully depends on electrophilic phosphate activation. Consequently, peptide bond formation in the presence of TP proceeds at lower pH and at lower temperatures owing to activation effects of the phosphate and mixed reaction mechanisms.

When bonds are formed between at least two different monomers, such as glycine (G) and alanine (A), there is potential to form peptides possessing different lengths and sequences, such as GG or AGG. We quantified the emergence of de novo peptides, under different reaction conditions, based on the number of unique species detected by HPLC. At least one peptide product was detected in 45% of the 132 tested combinations of pH and temperature (Fig. 3a). More peptides were detected at higher temperatures, with acidic (pH 2–3) and alkaline (pH 8–9) conditions yielding higher species counts at 100 °C. Below 40 °C, where bulk water remained in all samples, no peptides were detected.

Fig. 3

Distribution and enhancement of peptide species and peptide bond production across temperature and pH. (a) In the absence of TP, peptide products are more diverse at higher temperatures, pH 2–3 and pH 8–9. (b) In the presence of TP, diverse peptide products are formed across a wide range of pH and at low temperatures in bulk water. (c) Overall enhancement of peptide species diversity by TP is greatest for pH 4–8. During the 24 h incubation period, samples at and below 30 °C remained wet, while samples at and above 40 °C became dry. (d) In the absence of TP, peptide bond formation was enriched at higher temperatures, pH 2–3 and pH 8–9. (e) In the presence of TP, peptide bond formation was enriched across a wide range of pH and at low temperatures in bulk water. (f) Enhancement of peptide bond formation by TP spanned a broad range of pH and at low temperatures in bulk water. Data are available (Supplementary Information)

Activation of condensation reactions by TP significantly enhanced peptide yields and enabled their formation in new environments. When TP was present, high species counts spanned a much broader swath of acidic-to-alkaline conditions (pH 2–12) and temperatures above 60 °C (Fig. 3b). Notably, for alkaline conditions (pH > 7) and temperatures below 40 °C, dipeptide products were detected in the presence of bulk water, environments which yielded no detectable peptides in the absence of TP activation. The absence of tri- or higher peptides under bulk water conditions highlights the prevalence of single-hit reactions, where two amino acids condensed to form a dipeptide, but no dipeptides then participated in reactions to form still longer products. Further, the presence of tri- and higher peptides under dry conditions highlights the feasibility and prevalence of such multi-step reactions within the dry solid residues.

To see how TP at each condition contributed to enhancement of product diversity, differences between species counts in the presence and absence of TP were tabulated (Fig. 3c). Here, the net enhancement mediated by TP showed multiple peptide products formed in the presence of bulk water for alkaline conditions below 40 °C. Above 50 °C, species counts were enhanced closer to neutral conditions (pH 4–8) than the alkaline and acidic conditions favored in the absence of TP.
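The species-count bookkeeping behind this comparison can be sketched as follows. The detections in this example are hypothetical placeholders chosen only to show the calculation; they are not measured values from this study, and the function names are ours.

```python
# Hypothetical detections: (pH, temperature in °C) -> set of peptide species seen by HPLC.
# These entries are placeholders for illustration, not data from the study.
without_tp = {
    (9, 100): {"GG", "AG", "DKP"},
    (9, 20): set(),
    (6, 60): {"GG"},
}
with_tp = {
    (9, 100): {"GG", "AG", "DKP", "AGG"},
    (9, 20): {"GG", "AG"},
    (6, 60): {"GG", "AG", "DKP"},
}

def species_counts(detections):
    """Number of unique peptide species detected at each condition."""
    return {cond: len(species) for cond, species in detections.items()}

def enhancement(counts_with, counts_without):
    """Per-condition difference in species count (with TP minus without TP)."""
    return {cond: counts_with[cond] - counts_without.get(cond, 0)
            for cond in counts_with}

def fraction_with_products(counts):
    """Fraction of tested conditions yielding at least one peptide species."""
    return sum(1 for n in counts.values() if n > 0) / len(counts)

counts_without = species_counts(without_tp)
counts_with = species_counts(with_tp)
print(enhancement(counts_with, counts_without))
print(fraction_with_products(counts_without), fraction_with_products(counts_with))
```

Applied to the full 132-condition grid, the same difference yields a map of where TP adds species that were not detected in its absence.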

Characterization of the products on the basis of peptide bonds formed, rather than counting the number of species formed, showed similar patterns in the absence of TP. Regions of high peptide bond formation (Fig. 3d) correlated with observed regions of high species diversity (Fig. 3a). Further, in the presence of TP, a majority of the tested aqueous conditions (79%) yielded peptides above their limit of quantification (~0.5 μM), as shown in Fig. 3e. Efficient formation of peptides across the whole range of tested pH may be attributed to kinetic stability of the phosphate and its three levels of ionization (pKas of 2.2, 7.2, 12.3), which can promote more diverse chemistries for peptide bond formation (Fernandez-Garcia et al. 2017). Moreover, the free energy associated with the hydrolysis of TP is 21 kcal/mol (Kura et al. 1987; Meyerhof et al. 1953), or 7 kcal/mol for the cleavage of one P-O-P bond, almost double the ~3.6 kcal/mol needed to overcome the barrier for peptide bond formation (Borsook 1953). Thus, it is not surprising that TP contributed to higher peptide bond yields and enabled their formation over a broader range of temperature and pH conditions. The enhancement of peptide bond formation provided by TP, quantified by taking the differences (levels in Fig. 3e minus levels in Fig. 3d) to yield Fig. 3f, appears quite similar to Fig. 3e, indicating the very high enhancement of peptide bond yields associated with activation by TP.
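One way to tally peptide bonds from measured species concentrations is sketched below. The weighting is our assumption, since the text does not state how bonds were counted: each linear peptide of length n is taken to contribute n − 1 bonds, and the cyclic DKP is taken to contribute two. The concentrations are hypothetical, and the 0.5 μM cutoff mirrors the approximate limit of quantification quoted above.

```python
# Assumed peptide bonds per molecule: (length - 1) for linear peptides, 2 for cyclic DKP.
BONDS_PER_MOLECULE = {
    "GG": 1, "AG": 1, "GA": 1, "AA": 1,
    "DKP": 2,
    "AGG": 2, "GGA": 2,
    "GGGG": 3, "GGGGG": 4,
}

LOQ_UM = 0.5  # approximate limit of quantification quoted in the text

def peptide_bond_conc(conc_um, loq_um=LOQ_UM):
    """Total peptide bond concentration (uM), ignoring species below the LOQ."""
    return sum(BONDS_PER_MOLECULE[species] * conc
               for species, conc in conc_um.items()
               if conc >= loq_um)

def bond_enhancement(with_tp_um, without_tp_um):
    """Difference in peptide bond concentration (with TP minus without TP)."""
    return peptide_bond_conc(with_tp_um) - peptide_bond_conc(without_tp_um)

# Hypothetical concentrations (uM) at a single condition, for illustration only.
without_tp = {"GG": 10.0, "DKP": 0.2}
with_tp = {"GG": 10.0, "DKP": 2.0, "AGG": 1.0}

print(peptide_bond_conc(without_tp), peptide_bond_conc(with_tp),
      bond_enhancement(with_tp, without_tp))
```

Because GA and AA each contain one peptide bond, their combined, unresolved HPLC signal can be weighted as a single one-bond species under this scheme.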

Discussion

For highly evolved proteins in nature, the smallest amino acids, glycine and alanine, typically play discreet roles, providing local flexibility and contributing minimal steric constraints on structure. Relative to the other naturally occurring amino acids, glycine and alanine are seldom cited for their central roles in protein structures or functions. However, here in our minimal prebiotic model for peptide emergence, subtle differences in the reactivity of glycine and alanine were made detectable by setting their equimolar solutions to different pH and different temperatures (producing in most cases a dry residue) and analyzing the resulting rehydrated products. Key differences were the formation of a cyclic dipeptide, diketopiperazine, from glycine but not from alanine, tripeptides that were enriched in glycine but not alanine, and homopolymers of glycine (G4 and G5) but not of alanine. Longer products with more diverse compositions and sequences may well arise under longer incubation periods.

The simple recipe that we employed to make peptides from amino acids belies the generation of a reaction environment of significant complexity, especially under conditions where bulk solvent evaporates. In the resulting dry residue, chemical species concentrations can become large, products of one set of reactions can become starting materials for another set, creating conditions ripe for the emergence of interaction networks that were anticipated by theories on the chemical origins of life (Eigen 1971; Gánti 2003; Kauffman 1993). In addition, diffusion coefficients of reactants and products (including liberated water) can be many orders of magnitude lower in the solid dry phase than in the liquid phase (Bird et al. 1960), so the spatial heterogeneities that arise may also diversify the local reaction environments and emergent peptide sequences. Further, in such dry environments, reactions can be productive over hundreds of hours (Napier and Yin 2006), and simple amino acids and peptides may take on roles beyond reactants or products; they may potentially catalyze further reactions (Fitz et al. 2008; Plankensteiner et al. 2002). If such dry environments are rehydrated in the presence of activating agents such as TP, then condensation as well as hydrolysis reactions that cleave peptides may be enhanced (Sibilska et al. 2017), and delivery of fresh monomer building blocks can replenish depleted pools, all conditions that are primed to promote higher levels of molecular interaction and organization (Kauffman 1986; Pascal et al. 2013). The stage is now set to explore how different starting amino acids impact the length, sequences and potential functions of de novo peptide populations.