Introduction

The difficulty of forming RNA prebiotically has long been one of the greatest stumbling blocks for theories of the origin of life. Spark discharge experiments (simulating lightning) have been used to generate amino acids (Miller 1953; Cleaves et al. 2008), RNA precursor species such as peptide nucleic acid (PNA) (Nelson et al. 2000), and, in conjunction with eutectic freezing, some nucleobases (Menor-Salván et al. 2009), and they have been conjectured to form sugars (Schlesinger and Miller 1983); forming ribose prebiotically, however, is challenging (Shapiro 1984, 1988). Furthermore, hooking ribose together with pyrimidine nucleobases has long been considered difficult or impossible (Szostak 2009). Recently, Powner et al. (2009) discovered a way around this last problem by way of a novel multi-step ribonucleotide synthesis mechanism that bypasses the need for free sugars and nucleobases. As starting materials, their mechanism requires cyanamide (CN2H2), cyanoacetylene (C3HN), glycolaldehyde (HOCH2CHO), glyceraldehyde (C3H6O3), and inorganic phosphate. These compounds could conceivably have been brought in by comets or asteroids (Chyba et al. 1989, 1990) or micrometeorites (Maurette et al. 2000), or they could have been formed in situ on the early Earth. Here, we estimate the rate of formation of these compounds on the early Earth, focusing in particular on the key compound glycolaldehyde.

The key to forming any of the first four compounds in the atmosphere is to have an environment in which methane can polymerize to form higher hydrocarbons. Both photochemical models (Haqq-Misra et al. 2008; Domagal-Goldman et al. 2011) and laboratory experiments (Trainer et al. 2006) predict that methane should polymerize if the CH4:CO2 ratio exceeds ~0.1. Our current model focuses on a lower ratio of 0.02, where peak production of glycolaldehyde (GA) and a modest amount of polymerization occur. CO2 levels in the early atmosphere are largely unconstrained, with various authors predicting either very high values (Walker 1985) or very low ones (Sleep and Zahnle 2001). Paleosols suggest that CO2 concentrations were 10–50 times present (0.004–0.02 bar) at 2.7 Ga (Driese et al. 2011), but CO2 concentrations prior to this time could have been much higher in order to keep the early Earth warm. CO2 partial pressures exceeding ~0.03 bar produce enough greenhouse warming to resolve the faint young Sun problem, particularly when supplemented by CH4 (Haqq-Misra et al. 2008). Other methods to solve the faint young Sun problem include alternative models for solar evolution (Gaidos et al. 2000), novel greenhouse gases such as ammonia (Sagan and Chyba 1997) and carbonyl sulfide (OCS) (Ueno et al. 2009), and changes in cloud type and surface albedo (Rosing et al. 2010); these arguments will not be tested in this work. Here, the model calculations are performed at a lower CO2 partial pressure, 0.005 bar, to best remain in the regime of possible prebiotic methane concentrations and still explore a broad range of CH4:CO2 ratios (see the Model Description section for more details).

CH4 should have been easy to come by in the biotic era if methanogens evolved early (Kharecha et al. 2005), but whether it would have been present in high concentrations in the prebiotic era is uncertain (Emmanuel and Ague 2007; Lazar et al. 2012; Fegley and Schaefer 2012). For a 1-bar N2-dominated atmosphere, a CO2 partial pressure of 0.005 bar, and a CH4:CO2 ratio of 0.02, the required volume mixing ratio of CH4 to induce polymerization would be ~100 ppmv (~1 × 10⁻⁴ bar). This is large but not implausible. (See further discussion below.) So, we will henceforth assume that a thin organic haze was present in the prebiotic atmosphere.
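The arithmetic behind the ~100 ppmv figure can be checked with a short sketch. It assumes, as an idealization, that the volume mixing ratio is simply the partial pressure divided by the total pressure; the function name is illustrative and not part of the photochemical model.

```python
def ch4_mixing_ratio_ppmv(p_co2_bar, ch4_to_co2, p_total_bar=1.0):
    """Return the CH4 volume mixing ratio in ppmv.

    Assumes an ideal-gas mixture, so that
    mixing ratio = partial pressure / total pressure.
    """
    p_ch4_bar = p_co2_bar * ch4_to_co2        # CH4 partial pressure, bar
    return p_ch4_bar / p_total_bar * 1.0e6    # fraction -> ppmv


# Values quoted in the text: 0.005 bar CO2, CH4:CO2 = 0.02, 1-bar atmosphere
base_case_ppmv = ch4_mixing_ratio_ppmv(0.005, 0.02)
print(f"CH4 mixing ratio: {base_case_ppmv:.0f} ppmv")
```

The same function covers the full range explored later: ratios of 0.001 and 0.3 at 0.005 bar of CO2 correspond to 5 and 1,500 ppmv of CH4.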

Glycolaldehyde is formed in today’s atmosphere by oxidation of ethene (C2H4) and isoprene (C5H8) (Bacher et al. 2001; Magneron et al. 2005; Karunanandan et al. 2007). Ethene is a predicted gaseous component of a hazy Archean atmosphere (Pavlov and Kasting 2001; Domagal-Goldman et al. 2011). Formation of glycolaldehyde from ethene is discussed in the next section. Glycolaldehyde can also be formed by way of the formose reaction, which involves polymerization of formaldehyde (H2CO) in an aqueous solution (Butlerov 1861; Breslow 1959; Orgel 2000). Formaldehyde is formed in copious quantities in both clear and hazy low-O2 atmospheres (Pinto et al. 1980; Pavlov and Kasting 2001). So, this may represent a more efficient way of generating glycolaldehyde than direct atmospheric chemistry. The formose reaction is not without its drawbacks, however; it would also generate a number of other sugars, which may pose problems for prebiotic synthesis (Schwartz 2007). Here, we compare production rates of glycolaldehyde from both mechanisms to see which, if either, might have provided a useful source of this important precursor molecule.

Glycolaldehyde Chemistry

Two hypothetical formation routes have been proposed to generate GA photochemically, with one route for the modern atmosphere, and one route, proposed by the authors, for conditions which may have prevailed on the early Earth. Bacher et al. (2001) suggested the following production route:

$$ {\mathrm{C}_2\mathrm{H}_4 + \mathrm{OH} \mathop{\longrightarrow}\limits_{~}^{\mathrm{+M}} \mathrm{HOC}{_2}\mathrm{H}_{4}^{*}} $$
(1)
$$ {\mathrm{HOC}_2\mathrm{H}_{4}^{*} + \mathrm{O}_2 \mathop{\longrightarrow}\limits_{~}^{\mathrm{+M}} \mathrm{HOC}_2\mathrm{H}_4\mathrm{O}_2^{*}} $$
(2)
$${\mathrm{HOC}_2\mathrm{H}_4\mathrm{O}_{2}^{*} + \mathrm{NO} \longrightarrow \mathrm{HOC}_2\mathrm{H}_4\mathrm{O}^{*} + \mathrm{NO}_2 } $$
(3)
$${\mathrm{HOC}_2\mathrm{H}_4\mathrm{O}^* + \mathrm{O}_2 \longrightarrow \mathrm{HOCH}_2\mathrm{CHO} + \mathrm{HO}_2 } $$
(4)

In these reactions, M represents an arbitrary unreacting third molecule that carries off excess kinetic energy. This chain of reactions, which is considered to be the dominant production route under present-day atmospheric conditions, is highly dependent on atmospheric oxygen concentrations. In the type of weakly reducing atmosphere thought to have existed early in Earth’s history (Kasting 1993; Hashimoto et al. 2007; Tian et al. 2005), a more likely chain of reactions, which has been proposed by the authors, would be:

$$ {\mathrm{C}_2\mathrm{H}_4 + \mathrm{OH} \mathop{\longrightarrow}\limits_{~}^{\mathrm{+M}} \mathrm{HOC}_2\mathrm{H}_4^*} $$
(1)
$$ {\mathrm{HOC}_2\mathrm{H}_4^* + \mathrm{OH} \mathop{\longrightarrow}\limits_{~}^{\mathrm{+M}} \mathrm{HOCH}_2\mathrm{CH}_2\mathrm{OH}} $$
(5)
$$ \mathrm{HOCH_2CH_2OH + OH \longrightarrow HOCH_2C^*HOH + H_2O }$$
(6)
$$ \mathrm{HOCH_2C^*HOH + OH \longrightarrow HOCH_2CHO + H_2O }$$
(7)

Both chains begin with the hydroxyl radical adding to ethene and breaking its double bond (step 1). After that, the low-oxygen route is driven by hydroxyl interactions, whereas the Bacher et al. mechanism requires molecular oxygen at steps 2 and 4. In addition to these two routes, there is also a secondary reaction that can produce glycolaldehyde from one of the intermediate species generated in reaction 2:

$$ \mathrm{HOC_2H_4O_2^* + HOC_2H_4O_2^* \longrightarrow HOC_2H_4OH + HOCH_2CHO + O_2 }$$
(8)

A streamlined reaction series can be seen in Fig. 1, which contains all the reactions leading from ethene to GA. In each set of reactions, several steps lack well-constrained reaction rates, so rates from similar reactions (based on the similarity between reactants) have been used here as a proxy (see Appendix B). An analysis of estimated errors in reaction rates can be found in the Discussion. Lastly, it should be noted that the outlined formation mechanisms are not the only possible routes to producing GA, but represent the most direct synthesis from smaller molecules (a bottom-up approach). In solution, for example, metals have been proposed to catalyze the conversion of ethylene glycol (HOCH2CH2OH) to GA (Eisch et al. 2004), and photolysis of longer hydrocarbons present in either the atmosphere or the surface ocean could generate GA. Aside from the consideration of the formose reaction in later sections, no aqueous reaction systems are included, and contributions from the photolysis of longer chain hydrocarbons are expected to be small.

Fig. 1. Simplified series of reactions resulting in the formation of glycolaldehyde from ethene

Once GA is produced in the atmosphere, it can be transported to the surface in one of two ways: either by being incorporated into particles or by dissolving in rainwater. To simulate the first process, a new reaction was added to the photochemical model, in addition to the ones responsible for chemical production and loss. This new reaction, which we will call “sticking”, represents the adsorption of GA onto a haze particle and is based on the principles of collision theory. With this reaction in place, the overwhelming majority of GA is drawn out of the atmosphere and sticks to the haze particles, which effectively transports the entire column production of GA to the surface ocean or any existing continental area. Second, GA is highly soluble, with a Henry’s law coefficient of 4 × 10⁴ M atm⁻¹ (Betterton and Hoffmann 1988). This means that once GA is produced in the atmosphere, it can be transported very efficiently to the surface by rain-out, and it would be protected from photolysis while in its hydrated form (Bacher et al. 2001). Surface deposition is neglected for GA, since this contribution would be very small. These two processes individually, and in combination, control the flux of GA to the surface; the consequences of this transport will be described in further detail in the Discussion.
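To give a feel for what a Henry’s law coefficient of this size implies, the sketch below evaluates the standard equilibrium partitioning of a soluble gas between air and cloud water. The temperature and the cloud liquid water content are assumed illustrative values that do not come from the model, and the calculation is not the rain-out parameterization used in the photochemical code.

```python
R_L_ATM = 0.08206  # gas constant, L atm mol^-1 K^-1


def aqueous_fraction(henry_m_per_atm, temperature_k, liquid_water_fraction):
    """Equilibrium fraction of a soluble gas residing in cloud water.

    liquid_water_fraction is the volume of liquid water per volume of air
    (dimensionless). The dimensionless partition term is H * R * T * L.
    """
    partition = henry_m_per_atm * R_L_ATM * temperature_k * liquid_water_fraction
    return partition / (1.0 + partition)


H_GA = 4.0e4          # M atm^-1, value quoted in the text
T_ASSUMED = 288.0     # K, assumed temperature (not from the source)
LWC_ASSUMED = 1.0e-6  # assumed cloud liquid water volume fraction (not from the source)

f_aq = aqueous_fraction(H_GA, T_ASSUMED, LWC_ASSUMED)
print(f"Fraction of GA in the aqueous phase: {f_aq:.2f}")
```

Under these assumptions roughly half of the GA inside a cloud would be dissolved at equilibrium, which is consistent with the statement that rain-out is an efficient sink; a thinner cloud would lower this fraction.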

Model Description

We used a 1-dimensional (horizontally averaged) photochemical model to study this problem. The photochemical model is modified from the one utilized by Domagal-Goldman et al. (2011) (which was itself adapted from Kasting et al. (1979), by way of Pavlov and Kasting (2001)). It contains 88 chemical species interacting via 443 reactions, which can be found in Appendices A and B, respectively. Table 1 contains the list of long-lived species, while Table 2 has the short-lived (intermediate) species. The mixing ratios of both CO2 and N2 are held constant with altitude. The model simulations shown here all have the same amount of CO2 (5,000 ppmv, ~0.005 bar, or about 14 times the present atmospheric level, or PAL) in a 1 bar N2-dominated atmosphere. This amount of CO2 was chosen as a reasonable value from a climate standpoint, based on the wide range of geological evidence and the accompanying theories (see Feulner (2012), Section 5.3 for an exhaustive review), and because of the constraints on expected abiotic methane concentrations, which will be discussed later. Methane concentrations were varied from 5 ppm to 1,500 ppm to test the system over a range of hydrocarbon haze abundance. These concentrations correspond to CH4:CO2 ratios from 0.001 (less than a quarter of today’s ratio) to 0.3 (where polymerization of methane dominates the atmospheric chemistry). Atmospheric chemistry and transport were computed by solving a set of coupled partial differential equations, which were converted to ordinary differential equations by centered finite differencing and then integrated to steady state using the reverse (backward, i.e., implicit) Euler method. See Pavlov and Kasting (2001) for a more complete description of the model.
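The numerical strategy described above (centered finite differences in altitude, followed by implicit Euler steps until the solution stops changing) can be illustrated with a deliberately simplified toy problem. The sketch below treats a single species with constant eddy diffusivity, a linear loss term, and zero-flux boundaries; all parameter values and function names are assumptions for illustration, whereas the actual model is nonlinear and couples 88 species.

```python
import numpy as np


def build_operator(n_levels, dz, diffusivity, loss_rate):
    """Centered finite-difference operator for dn/dt = K d2n/dz2 - loss * n.

    Zero-flux boundaries are imposed at the top and bottom of the column.
    """
    a = np.zeros((n_levels, n_levels))
    k = diffusivity / dz**2
    for i in range(n_levels):
        if i > 0:
            a[i, i - 1] = k
            a[i, i] -= k
        if i < n_levels - 1:
            a[i, i + 1] = k
            a[i, i] -= k
        a[i, i] -= loss_rate
    return a


def integrate_to_steady_state(a, production, n0, dt0=1.0, growth=2.0,
                              tol=1e-10, max_steps=200):
    """Take backward (implicit) Euler steps with a growing time step."""
    identity = np.eye(len(n0))
    n, dt = n0.copy(), dt0
    for _ in range(max_steps):
        # Implicit update: (I - dt*A) n_new = n_old + dt*P
        n_new = np.linalg.solve(identity - dt * a, n + dt * production)
        if np.max(np.abs(n_new - n)) <= tol * np.max(np.abs(n_new)):
            return n_new
        n, dt = n_new, dt * growth
    return n


# Toy column: 50 levels of 1 km (expressed in cm), assumed K and loss rate
n_levels, dz = 50, 1.0e5
a = build_operator(n_levels, dz, diffusivity=1.0e5, loss_rate=1.0e-6)
production = np.zeros(n_levels)
production[30:40] = 1.0  # assumed source layer, molecules cm^-3 s^-1
n_ss = integrate_to_steady_state(a, production, np.zeros(n_levels))
residual = np.max(np.abs(a @ n_ss + production))
print(f"max |A n + P| at steady state: {residual:.2e}")
```

Because the implicit step is unconditionally stable for this kind of problem, the time step can be increased geometrically, so the integration reaches steady state in a few dozen steps rather than resolving the fastest chemical timescale throughout.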

The model also includes UV absorption cross-sections for GA, most recently measured by Karunanandan et al. (2007), to calculate losses from photolysis. In addition, the model includes a hydrocarbon haze, along with two other types of aerosols, S8 and H2SO4. Vertical number density profiles for all three aerosols are computed using a steady-state, tridiagonal solver, which is run at each step of the time-stepping loop. The hydrocarbon haze in this model is calculated using the fractal haze properties of Wolf and Toon (2010). The key differences between spherical and fractal hazes are described by those authors. For more information about the implementation of the fractal haze algorithm, see Supplemental Materials.

Results

Figure 2 shows the total column production of both formaldehyde (H2CO, or FA) and GA for the full range of CH4:CO2 ratios. As seen in Fig. 2, the strongest column-integrated production of GA occurs at a CH4:CO2 ratio of 0.02 (total column production for GA is ~7 × 10⁵ cm⁻² s⁻¹), which is used for Figs. 3, 4 and 5. At this CH4:CO2 ratio, the amount of methane and hydrogen present in the atmosphere allows for a greater amount of molecular oxygen, which is integral to the Bacher et al. production route for GA. Figure 3 shows the main atmospheric gas species mixing ratios with height on the left, as well as the dominant long-chain hydrocarbons on the right. Note the peak in hydrocarbons and water that occurs where the O2 concentrations are reduced near 50 km. Figure 4 shows the number densities of certain subsets of species: panel (a) shows H- and O-related radicals; panel (b) shows the various radicals produced from the photolysis of methane; panel (c) shows formaldehyde, as well as several radical precursors to species found in panel (d); and finally, panel (d) shows the immediate precursors to GA. Figure 5 has two parts: part (a) shows the vertical column production rates for several different CH4:CO2 ratios, and part (b) describes the dominant production and loss rates with height. Interestingly, in Fig. 5b, it can be seen that the principal production of GA is not through either of the two main routes, but rather through reaction 8. This reaction is much faster than the proposed anoxic production route and minimizes the amount of O2 necessary to produce GA, thus offering a compromise between the anoxic and oxic production routes. The intermediate HOC2H4O* very rapidly decays, preventing the set of reactions (reactions 1–4) proposed by Bacher et al. (2001) from going to completion. This results in the production via reaction 4 being nearly 40 orders of magnitude lower than via reaction 8.

Fig. 2. Total production of both formaldehyde (dashed line) and glycolaldehyde (solid line) for a range of CH4:CO2 ratios. Note that the formaldehyde values are scaled down by 10⁶ (e.g., production of FA at 0.17 CH4:CO2 is ~3 × 10¹¹ cm⁻² s⁻¹)

Fig. 3. Long-lived species mixing ratios (a), and prevalent longer-chain ordinary hydrocarbons (b)

Fig. 4. Prevalent H and O radicals (a), methyl radicals (b), C2H4OH, C2H4O, and HCO radicals, alongside formaldehyde (c), and important GA precursor molecules (d)

Fig. 5. Total production for GA, with varying CH4:CO2 ratios on the left. Note that the number densities have been reduced by a factor of a thousand. Key reactions for the production and loss of GA on the right. Reactions (7) and (8) are the production routes in the text from HOC2H3OH and HOC2H4O2, respectively, while reactions L1 and L2 are destruction by OH via H-abstraction

The organic haze produced at 2 % CH4:CO2 is very thin (the total column optical depth at 500 nm is ~7 × 10⁻⁸), and sensitivity studies show that the haze remains thin in the visible for a much broader range of CH4:CO2 ratios than spherical haze under comparable conditions. The critical amount of polymerization for an optically thick haze is pushed from a CH4:CO2 ratio of 0.2 to closer to 0.3, due to the fractal nature of the haze. For more results regarding the haze, see the Supplemental Materials.

Discussion

Prebiotic CH4 Concentrations

We first return to the question of how much CH4 could have been present in the prebiotic atmosphere. Our base model assumes 100 ppmv CH4. According to Pavlov and Kasting (2001), maintaining this atmospheric concentration would require a CH4 source of ~10¹⁰ molecules cm⁻² s⁻¹, or ~5 × 10¹² mol yr⁻¹, which is one-tenth of the present biological CH4 flux (Prather 2001) (1 mol yr⁻¹ = 0.00374 molecules cm⁻² s⁻¹). The current abiotic flux of CH4 from midocean ridges could be as high as 10¹² mol yr⁻¹, based on a measured CH4 concentration of 1–2 mmol/kg at the Lost City vent field on the Mid-Atlantic Ridge (Kelley et al. 2005) and a hydrothermal circulation rate of ~10¹⁵ kg/yr, estimated from heat flux measurements (Mottl and Wheat 1994). A more conservative estimate, based on the assumption that about 15 % of seafloor oxidation can be attributed to serpentinization, is that the CH4 flux from this source is closer to 10¹¹ mol yr⁻¹ (Sleep 2005). This CH4 is produced by serpentinization of peridotite, an ultramafic rock type, in the presence of dissolved CO2 (an ultramafic rock is one that is rich in magnesium, like the mantle). This CH4 source would need to be 5–50 times bigger on the early Earth in order to generate 100 ppmv of atmospheric CH4. Today, ultramafic rocks constitute only a small fraction of the oceanic crust that interacts with seawater. On the early Earth, because of its hotter mantle, much higher degrees of partial melting may have occurred during magma generation, creating thick oceanic crust that should also have been ultramafic (Moores 1986, 1993, 2002; Sleep 2007). Herzberg et al. (2010) predicted quantitatively that Archean seafloor should have contained 18–24 % MgO, as compared to 10–13 % MgO in Phanerozoic seafloor. Thus, high prebiotic atmospheric CH4 mixing ratios are speculative, but not implausible.

If prebiotic CH4 concentrations were lower or higher than assumed here, then atmospheric production of GA would have been lower, assuming 5,000 ppm of CO2. The primary control on GA production in the atmosphere is the CH4:CO2 ratio, with a weaker dependence on the total abundance of CH4.
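The conversion between a global flux in mol per year and a column flux in molecules per square centimeter per second, used repeatedly in this section, follows from Avogadro’s number, Earth’s surface area, and the length of a year. The sketch below reproduces the conversion factor quoted above; the surface area and the number of seconds per year are standard round values assumed here, and the helper names are illustrative.

```python
AVOGADRO = 6.022e23         # molecules per mole
EARTH_AREA_CM2 = 5.1e18     # Earth's surface area in cm^2 (assumed round value)
SECONDS_PER_YEAR = 3.156e7  # seconds in one year (assumed round value)

# Column flux (molecules cm^-2 s^-1) equivalent to a global flux of 1 mol yr^-1
FLUX_PER_MOL_YR = AVOGADRO / (EARTH_AREA_CM2 * SECONDS_PER_YEAR)


def mol_per_yr_to_column_flux(mol_per_yr):
    """Convert a global flux (mol yr^-1) to molecules cm^-2 s^-1."""
    return mol_per_yr * FLUX_PER_MOL_YR


def column_flux_to_mol_per_yr(flux_cm2_s):
    """Convert a column flux (molecules cm^-2 s^-1) to mol yr^-1."""
    return flux_cm2_s / FLUX_PER_MOL_YR


print(f"1 mol/yr = {FLUX_PER_MOL_YR:.5f} molecules cm^-2 s^-1")
for ridge_flux in (1.0e11, 1.0e12):  # abiotic ridge estimates quoted in the text
    column = mol_per_yr_to_column_flux(ridge_flux)
    print(f"{ridge_flux:.0e} mol/yr -> {column:.1e} molecules cm^-2 s^-1")
```

The same helpers apply to any globally averaged flux in the paper, for example when comparing column production rates from the photochemical model with geological source estimates.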

Uncertainties in Atmospheric Production of Glycolaldehyde

The rates associated with reactions 1–8, with the exception of reactions 5 and 7, have well-constrained errors. Following the published rate and error data for reactions 1 and 2, the total column production varies from the reported result by less than 25 % and 20 %, respectively. The error for reaction 6 changes total column production by less than 10 %, while errors associated with reaction 8, the principal production route in the model atmosphere, vary total column production by a factor of two. Estimated errors for reaction 5 produced no significant variation in total column production, which is most likely because reaction 8 supplies the bulk of the reactant necessary for reaction 6, HOCH2CH2OH. Estimated errors for reaction 7 produced only small changes (~1 %), because production via reaction 8 is the dominant formation mechanism. Reaction 4 does not play a significant role in GA production.

Perhaps the largest source of uncertainty in our model of GA production is the efficiency with which it would have been delivered to Earth’s surface. We have effectively maximized this efficiency by assuming that every GA molecule that is adsorbed onto a haze particle is delivered intact to the surface. This optimistic assumption may not be correct. If instead the GA molecules reacted chemically with the compounds in the haze particles, then delivery of GA could have been much less efficient. The situation changes somewhat for GA that is produced in the troposphere (the lowest 10 km in our model). There, the haze particles would likely have acted as cloud condensation nuclei (CCN) (Engelhart et al. 2011; Twohy et al. 2005) and would have become coated with water. GA molecules that collided with these particles would have gone into solution, rather than attaching directly to the surface of the haze particle, where they could have become irreversibly bound. Some separation between the haze particle and the GA molecule greatly increases the likelihood that GA would survive transportation through the troposphere. If we assume that only GA formed within the troposphere made it to the surface, then the rate of GA delivery would have been reduced by a factor of 2000 compared to our base model. From Fig. 2, the maximum production rate of GA is 7 × 10⁵ cm⁻² s⁻¹, or ~2 × 10⁸ mol yr⁻¹. So, in the troposphere-only model, this rate would be ~1 × 10⁵ mol yr⁻¹. Below, we use the higher value for GA production; this rate is marginally able to support prebiotic synthesis.

Predicted Glycolaldehyde Concentrations in the Ocean

How low the corresponding bulk ocean GA concentration would be depends on a number of factors, such as the efficiency and mode of transport (e.g., sticking or rainout), and whether there were aqueous production routes (e.g., the formose reaction), which could have generated GA from other molecules. The latter possibility is discussed in the next section.

Neglecting the formose reaction for now, we can calculate an upper limit on the dissolved GA concentration in the open ocean by balancing its production from photochemistry (from Fig. 2) with its presumed destruction when seawater passes through the midocean ridge hydrothermal vents. This neglects all other potential loss mechanisms, and thus is the most optimistic prediction for bulk ocean concentration, but a number of loss processes associated with FA (discussed in the next section) could also impact GA concentrations. As pointed out above, the maximum atmospheric production rate of GA is ~2 × 10⁸ mol yr⁻¹. At present, the time required to cycle the entire volume of the oceans (1.4 × 10²¹ L) through the hot axial midocean ridge hydrothermal vents is ~10⁷ yr. Thus, the rate at which GA would cycle through the ridges today would be equal to its concentration times 1.4 × 10²¹ L/10⁷ yr (= 1.4 × 10¹⁴ L yr⁻¹). Setting production and loss equal yields

$$ \begin{aligned} C_{\text{GA}} = \frac{\text{Rate in}}{\text{Rate out}} &= \frac{\text{Rate}_{\text{R+S+s}}}{1.4\times10^{14}~\text{L/yr}} \\ &= \frac{2 \times 10^{8}~\text{moles/yr}}{1.4\times10^{14}~\text{L/yr}} \\ &\approx 1 \times 10^{-6}~\text{M} \end{aligned} $$
(9)
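The balance in Eq. 9 is a one-box steady state: the source equals the concentration multiplied by the volume of seawater flushed through the vents each year. A minimal sketch of that calculation, using only the values quoted in the text (the function name is illustrative), is:

```python
OCEAN_VOLUME_L = 1.4e21       # ocean volume in liters, from the text
VENT_CYCLING_TIME_YR = 1.0e7  # time to cycle the ocean through the vents, from the text


def steady_state_concentration(source_mol_per_yr,
                               ocean_volume_l=OCEAN_VOLUME_L,
                               cycling_time_yr=VENT_CYCLING_TIME_YR):
    """Molar concentration at which vent loss balances the source."""
    flushing_rate_l_per_yr = ocean_volume_l / cycling_time_yr
    return source_mol_per_yr / flushing_rate_l_per_yr


c_ga = steady_state_concentration(2.0e8)  # maximum atmospheric GA source
print(f"C_GA = {c_ga:.1e} M")
```

Since vent passage is the only sink considered, the result scales linearly with the source and inversely with the flushing rate, which is why a faster circulation through the vents would lower the predicted concentration.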

The circulation rate through the vents may have been higher in the past (Isley 1995), and ocean volume could conceivably have been higher (Korenaga 2008), so the actual dissolved GA concentration could have been up to a factor of two or three lower or higher, respectively, even without considering other loss processes. By comparison, the concentrations of GA used in the laboratory experiments of Powner et al. (2009) were of the order of 1–2 M. So, as other authors have concluded for various other prebiotic compounds, filling the entire ocean with GA does not appear feasible, at least not by this mechanism (Schlesinger and Miller 1983). This does not necessarily indicate that the mechanism is implausible; however, it demonstrates that strong concentration processes, e.g., evaporating tide pools or eutectic freezing (Sanchez et al. 1966; Miyakawa et al. 2002), would be needed to produce GA concentrations compatible with Powner et al.’s synthesis mechanism. Regardless of the concentration mechanism, the coincident non-volatiles, such as salt, soluble metals, and more prevalent chemical species than GA, would be similarly concentrated, which may have an unanticipated effect on the concentration of GA or the synthesis mechanism outlined by Powner et al.

Production of Glycolaldehyde by Way of the Formose Reaction

A second pathway for producing GA is by way of the formose reaction (Butlerov 1861; Breslow 1959; Orgel 2000), which involves polymerization of FA into longer hydrocarbon chains. An example of the first few reactions can be seen in Fig. 6. By contrast with GA, FA should have been produced in copious quantities in the prebiotic atmosphere (Pinto et al. 1980). However, when GA production is maximized at 2 % CH4:CO2, FA production suffers. At 2 % CH4:CO2, the atmospheric production of FA is only 35 % of the production at 20 % (column-integrated production of 8.1 × 10¹⁰ versus 2.3 × 10¹¹ cm⁻² s⁻¹, respectively). Setting aside the focus on the atmospheric production of GA, we will briefly consider the higher FA production levels, consistent with a CH4:CO2 ratio of 0.2.

Fig. 6. The first few steps of the formose reaction

For the case above, which maximizes FA production, the corresponding rate at which FA enters the oceans is 1.5 × 10⁹ cm⁻² s⁻¹, or ~4 × 10¹¹ mol yr⁻¹. Following the same logic as before (i.e., assuming removal of formaldehyde at the midocean ridges), the bulk ocean concentration of FA would be 2 mM. This high concentration is an upper limit, given that there are a number of other processes, such as reactions with sulfur species, amines, HCN, and mineral surfaces, as well as the effect of varying pH or salinity, that might control dissolved concentrations of FA (Cleaves 2008; Allou et al. 2011). Given that millimolar concentrations of FA can be sufficient to support the formose reaction (Gabel and Ponnamperuma 1967), it is possible that the hydrothermal vent system could be used to generate GA from FA. The conversion of FA into more complex organic compounds is consistent with the results of Kopetzki and Antonietti (2011), who have demonstrated that FA is removed very rapidly at the vents. In their experiments at 100 bar and a variety of temperatures (60–200 °C), a 0.5 molar concentration of formaldehyde yielded a plethora of organic molecules. Applying the experimental yields from Kopetzki and Antonietti at the much lower concentrations of FA introduces some complications. If the rate of the formose reaction is proportional to the square of the concentration of FA, it is possible that the amount of GA produced could be as much as a factor of 10⁶ lower than calculated below. However, the decomposition of the longer chain products may generate GA as an intermediate, which stabilizes GA yield on longer timescales than other products; furthermore, lower concentrations may limit the degree to which polymerization occurs, which would increase the yield of GA compared to other products.

Assuming, then, that the same yields remain valid for lower concentrations of FA, GA production by the hydrothermal formose reaction would result in nearly 30 μM GA, even at the lowest experimental yields (0.8 %). This level of production translates to over 2 × 10⁹ mol yr⁻¹ (~2 × 10⁸ kg yr⁻¹), while maximum yields (1.41 %) would result in production on the order of 4 × 10⁹ mol yr⁻¹ (~3 × 10⁸ kg yr⁻¹). This is equivalent to 20 times the maximum atmospheric production of 2 × 10⁸ mol yr⁻¹. This suggests that, even at lower CH4:CO2 ratios, the formose reaction would be the dominant production pathway for GA.

Alternatives and Enhancements to the Formose Reaction

The formose reaction comes with a number of caveats and potential pitfalls. Often, the concentrations of potential reactants are less than the threshold for the autocatalytic reaction, which can limit the scope of products from the formose reaction. Additionally, the veritable zoo of products from the unmediated (i.e., without the addition of some stabilizing agent) formose reaction tends to be unstable and unsuitable for further reactions, and is most often described as a “tar”. Shallow tidal pools on early continental areas have been proposed as a possible method for concentrating compounds such as GA, but given the lack of data concerning the distribution and scale of continents from the Hadean into the Archean, it is difficult to prove that such environments were widespread, or that they even existed (Korenaga 2008). Clay surfaces have been shown experimentally to be effective at removing water, which provides another method for concentrating relevant organic compounds; Hazen and Sverjensky (2010) present a concise review of work in this direction. Stability can be aided by the addition of a mediating compound, which can limit the number of stable products, as well as increase their longevity in solution. The number of proposed mediating materials has grown in recent years: they include pyrite mineral surfaces (Wächtershäuser 1992), borate (Kim et al. 2011), silicates (Lambert et al. 2010), and some other metal complexes, such as molybdate, vanadate, germanate, and aluminate (Schilde et al. 1994) (see Benner et al. (2010) for a fairly comprehensive overview). Better ways of producing GA in liquid solution may exist. Pestunova et al. (2005) (as well as later experiments (Delidovich et al. 2009, 2011)) have shown that GA can be produced by UV photolysis of FA in solution. This process happens in neutral to weakly alkaline solution, so it might conceivably have occurred within the surface ocean. It might also have occurred in raindrops (because FA would have dissolved in them), but this seems less likely because the raindrops should have been acidic due to the presence of high concentrations of atmospheric CO2.

More recently, experiments like those of Powner et al. (2009) have made progress in exploring alternatives to the formose reaction, with the introduction of phosphate into the synthesis cycle stabilizing the end product, β-ribocytidine-2’,3’-cyclic phosphate. The experiments of Ritson and Sutherland (2012) have shown that derivatives of GA and GCA (glyceraldehyde) can be formed photolytically in liquid solution from HCN and H2CO in the presence of various cyanometallates, e.g., copper cyanide complexes. This chemistry may also provide a route to assembly of pyrimidine ribonucleotides, a task that had heretofore proven difficult, though not impossible (Sanchez and Orgel 1970). If an organic haze was present, then cyanamide and cyanoacetylene could likely have been formed by reactions analogous to those that form HCN, which may have been prevalent under early Earth atmospheric conditions (Zahnle 1986; Tian et al. 2011). From there, other authors have focused on the behavior of cyanoacetylene, its potential contribution to the origins of life, and the difficulties associated with more complex prebiotic chemistry (Robertson and Miller 1995; Shapiro 1999; Nelson et al. 2000; Orgel 2002, 2004). We have not investigated such reactions explicitly, but sufficient available HCN and cyanoacetylene are two of the requirements of the mechanism proposed by Ritson and Sutherland. So, both of these routes for GA production could be promising.

Delivery of Glycolaldehyde from Space

Finally, a third method of acquiring GA is delivery from space. The organic content of meteorites consists principally of macromolecular organic material (Gardinier et al. 2000), while a large part of the remaining fraction would be single-carbon species, such as methane, methanol, and FA. From the molecular abundances of ices in comets, methyl formate (HCOOCH3), an isomer of glycolaldehyde, represents less than a percent of the volatile composition (Ehrenfreund et al. 2002). Meteorites and comets (Chyba et al. 1990), or interplanetary dust (Anders and Grevesse 1989), could have contributed anywhere from 10⁸ to 10¹⁰ kg yr⁻¹ of organics to the early Earth. If GA represents the same fraction of organic material as its isomer methyl formate (~1 %), the amount of GA delivered from space is between 10⁶ and 10⁸ kg yr⁻¹ (~2 × 10⁷ to 2 × 10⁹ mol yr⁻¹). This is likely an overestimate, considering that the Murchison meteorite contains ~27 ppm of aldehydes and ketones (which includes both FA and GA) (Ehrenfreund et al. 2002), but the specific contribution of GA to the early Earth from all exogenous sources is largely unconstrained. From our photochemical model, the production of GA directly from the atmosphere ranges from 10⁷ kg yr⁻¹ (best-case) to 100 kg yr⁻¹ (worst-case), while production of GA from the formose reaction at the vents would be 1–2 × 10⁸ kg yr⁻¹. This means that the best-case organics delivery from space would match the expected generation of GA by the formose reaction at the vents, and vent formation would dominate for more conservative values of exogenous delivery.
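The mass-to-mole conversion behind the exogenous delivery estimate can be sketched as follows. The 1 % GA fraction is the assumption stated in the text (by analogy with methyl formate); the molar mass of C2H4O2 is a standard value, and the function name is illustrative.

```python
GA_MOLAR_MASS_KG = 0.06005  # kg mol^-1 for glycolaldehyde, C2H4O2


def ga_delivery_mol_per_yr(organics_kg_per_yr, ga_mass_fraction=0.01):
    """Moles of GA delivered per year for a given exogenous organic flux.

    ga_mass_fraction defaults to the ~1 % assumed in the text.
    """
    ga_kg_per_yr = organics_kg_per_yr * ga_mass_fraction
    return ga_kg_per_yr / GA_MOLAR_MASS_KG


low = ga_delivery_mol_per_yr(1.0e8)    # lower bound on organic delivery
high = ga_delivery_mol_per_yr(1.0e10)  # upper bound on organic delivery
print(f"GA delivery: {low:.1e} to {high:.1e} mol/yr")
```

Both bounds scale linearly with the assumed GA fraction, so a fraction closer to the ppm-level aldehyde content of Murchison would lower them by several orders of magnitude.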

Conclusion

We investigated production of glycolaldehyde from three different mechanisms: (1) atmospheric photochemistry, (2) the formose reaction (in raindrops, the surface ocean, and at hydrothermal vents), and (3) delivery from space. The latter two mechanisms can each generate a maximum of ~2 × 10⁹ mol yr⁻¹ of GA, while mechanism (1) generates an order of magnitude less. None of these mechanisms can produce oceanic GA concentrations exceeding ~2 mM. By comparison, experimental synthesis of RNA by the mechanism of Powner et al. (2009) requires 1–2 M GA. So, unless GA could somehow be greatly concentrated, by evaporation or eutectic freezing, for example, prebiotic synthesis of RNA could not have proceeded in this way.

Future research should concentrate on identifying alternative pathways for producing glycolaldehyde in the early atmosphere or oceans, or on alternative mechanisms for generating RNA, presuming that an RNA world was the first step to life as we know it. Regardless of the model assumed for the origins of life, sugars are an essential part of extant life on Earth, and represent a critical and necessary component.