The field of self-replication chemistry [1–23] currently aims to understand and control the energy landscape that determines the type of autocatalytic growth, which distinguishes the chemistry of parabolic coexistence [24] from the biological physics building on the Darwinian principle [25]. Small organic replicators, especially those that are design offspring of the system introduced by Wang and Sutherland [16], offer particular advantages in the quest for a more detailed understanding of replicator dynamics: they are large enough to exhibit autocatalysis coupled to information transfer, yet small enough to be treated from first principles. Here we introduce a replicator which utilizes a fulvene-based Diels-Alder reaction (see Fig. 1) and show that combining ab initio molecular dynamics (AIMD) as developed by Car and Parrinello [26] with NMR kinetics, supported by computed chemical shifts and 2D-NMR methods, allows us to decipher the structural and energetic rationale behind the observed behaviour, whereas the static computational methods currently used in the field did not reproduce the experimental data.

Figure 1

Possible reaction pathways of the system. a) A two-letter code is used to name products. The first letter refers to endo (N) or exo (X) Diels-Alder products, while the second characterizes the direction of the amidopyridine recognition site (N: same side as bridge, X: opposite side of bridge). Equilibrium constants are given as association constants. The difference between A·B·NX and A·B·NX* is the relative alignment of A and B. Hydrogen atoms are omitted for clarity (except OH and NH). b) The bimolecular background reaction is shown in 2D as an example. c) A*·B complex (2D) and A·B complex (3D) with almost no screening effect. d) A*·B complex (2D) and A·B complex (3D) with a good screening effect of the top side. 3D structures in (c) and (d) were obtained by geometry optimizations using density functional theory.

We selected fulvene chemistry for replicator construction for two reasons. First, fulvene chemistry allows facile variation of the diene part. Second, compared with an earlier described system [17], we aimed to arrive at a nonchiral reaction product in order to simplify the kinetic analysis. In principle, there are four possible products -- two endo and two exo diastereomers -- from the reaction of A and B, three of which are observed by NMR (Fig. 2). Only the endo products (NN, NX) are able to replicate via a termolecular complex. The kinetics of the reaction of A with B were studied by time-resolved 1H-NMR (600 MHz) in CDCl3 at 293 K (Fig. 2a). One main product is apparent, whose curve exhibits a sigmoidal shape caused by an induction period, which is typical of autocatalytic reactions. To prove the specificity of the catalysis, 2 eq of benzoic acid were added as a competitive inhibitor, resulting in a deceleration of the reaction. Conversely, when a 10% product mixture -- corresponding to almost pure main product -- was added as a catalyst at the start, the production of the main product was accelerated and the induction period almost vanished [see Additional file 1]. We observed that product precipitation is kinetically hindered, even for days. Once precipitated, however (e.g. by solvent evaporation), the products could not be redissolved. This prevented chromatographic separation and, as a consequence, the separate addition of each individual product. Another measurement was carried out using the methyl ester A' to disable molecular recognition in the system, allowing independent measurement of the background reaction. As expected, the resulting concentration-time data do not show an induction period and the reaction proceeds much more slowly (Fig. 2a).
Chromatographic separation by high performance liquid chromatography (HPLC) and identification of the background reaction products by rotating frame nuclear Overhauser enhancement spectroscopy (ROESY) revealed that only NN and NX are formed. There is a slight diastereoselectivity of 2:3 in favour of NX, which must be due to subtle electronic effects, as there are no steric interactions disfavouring the formation of NN. However, when one equivalent of succinimide A* was added, allowing an unreactive A*·B complex to form as a model for the real A·B complex (Fig. 1c/d), both isomers were formed at identical rates. This can be explained by a more effective screening of one side of the fulvene in the A*·B complex rotamer depicted in Fig. 1d. It is evident from the respective three-dimensional models of the real A·B complex that the same argument holds for the autocatalytic system. As the background reaction dominates the system at early reaction times, when A·B complexes are the predominant species, it can be assumed that both templates are formed at identical rates via this pathway.
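The sigmoidal growth, its loss on seeding, and the slower template-free background reaction described above can be illustrated with a deliberately minimal rate law. The sketch below is not the kinetic model fitted in this work: it lumps all templates into one species T, ignores complex equilibria and product inhibition, and assumes d[T]/dt = (k_bg + k_cat·[T])·[A]·[B]. The function names and all numerical values are hypothetical and chosen only to make the qualitative behaviour visible.

```python
def product_curve(a0, b0, seed, k_bg, k_cat, dt=10.0, steps=20000):
    """Newly formed product x(t) for the assumed toy rate law
    dx/dt = (k_bg + k_cat * (seed + x)) * (a0 - x) * (b0 - x),
    integrated with a fixed-step fourth-order Runge-Kutta scheme.
    Concentrations in M, time in s; all parameters are hypothetical."""
    def rate(x):
        return (k_bg + k_cat * (seed + x)) * (a0 - x) * (b0 - x)

    x = 0.0
    curve = [x]
    for _ in range(steps):
        k1 = rate(x)
        k2 = rate(x + 0.5 * dt * k1)
        k3 = rate(x + 0.5 * dt * k2)
        k4 = rate(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        curve.append(x)
    return curve


def time_to_conversion(curve, target, dt=10.0):
    """First time at which the newly formed product reaches `target`
    (None if it is never reached within the simulated window)."""
    for i, x in enumerate(curve):
        if x >= target:
            return i * dt
    return None


def index_of_peak_rate(curve):
    """Step index with the largest concentration increment.
    An index > 0 means the rate first rises, i.e. an induction period."""
    increments = [curve[i + 1] - curve[i] for i in range(len(curve) - 1)]
    return increments.index(max(increments))
```

With, for example, `product_curve(0.02, 0.02, 0.0, 1e-4, 0.5)` for the unseeded case, `seed=0.002` for a 10% seeded run and `k_cat=0.0` for the recognition-disabled background, the unseeded curve shows its maximum rate only after a delay, seeding shortens that delay, and the background curve has its maximum rate at t = 0.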

Figure 2

Concentration-time curves and NMR data for all products. a) Black circles show the autocatalytic reaction, empty squares the reaction in the absence of recognition sites using A'. The rate of formation of NX is lower than that of NX' in the background reaction, as precursors are consumed predominantly by autocatalytic pathways leading to NN. b) One set of pyridine protons allowed the concentrations of all three products to be extracted from NMR measurements. The assignment was supported by the calculated shifts shown here. c) Section of a ROESY spectrum showing all three NH protons. A coupling indicating chemical exchange is only visible between NN and NX.

As most signals of the different product isomers in the autocatalytic reaction overlapped due to structural similarities, it was impossible to determine the composition of the product mixture from 1D-NMR. Again, we applied ROESY after the reaction was completed (see Fig. 2c). The spectra clearly showed that the amide NH proton of one isomer does not undergo chemical exchange, while exchange was observable for the remaining two isomers. Only XX is expected to exhibit intramolecular hydrogen bonding and therefore no chemical exchange; its existence in the mixture was further supported by cross peaks indicating an exo Diels-Alder product. The presence of XN could be ruled out by the following reasoning: ROESY spectra and HPLC plots of the background reaction using the methyl ester A' did not show any exo products. The occurrence of an exo product in the presence of recognition sites therefore means that a recognition-mediated reaction pathway is exploited. Such a pathway is only available for the XX isomer via an A·B complex. The remaining two products were identified as the NN (main product) and NX (side product) isomers.

Since the catalytic properties of the individual products could not be characterized by experiment, we performed AIMD free energy calculations of the A+B [4+2]-cycloaddition in the presence of different template isomers using the recently proposed dynamic distance constraint [27–29], which controls the root mean square distance, D, between the two pairs of carbon atoms involved in the cycloaddition. This 'dynamic' distance D was varied incrementally from 3.6 Å (precursors) to 1.6 Å (product), and free energy profiles were obtained by thermodynamic integration. These are compared to minimum energy paths (MEPs) obtained from constrained geometry optimizations along this reaction coordinate. Fig. 3 clearly shows that there are significant entropic contributions separating the free energy curves from the MEPs. Moreover, the MEPs display substantial ruggedness because, at each value of D, the highly flexible structures B and NN or NX possess a large number of local minima, further demonstrating the need to carry out molecular dynamics studies. Hence, interpreting replicators purely on the basis of 0 K energy barriers obtained from optimized transition state structures carries considerable uncertainty [17, 19–21, 23]. In the case of the NN product, for instance, the reaction barrier of the MEP appears to be lower for the templated than for the non-templated reaction by 13 kJ/mol. Crucially, however, the free energy profiles are nearly identical. Thus the catalytic effect of the template is not the result of a change in the electronic structure of the A+B transition state, but rather of a change in the molecularity of the ligation reaction from second-order to pseudo-first-order without an energetic penalty.
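The thermodynamic integration step can be sketched in a few lines. The code below is an illustration under stated assumptions, not the workflow used for the simulations reported here: it reads D as the root mean square of the two forming C-C distances, assumes that each constrained run supplies the mean free-energy gradient dF/dD at its value of D (the sign relating this gradient to the raw constraint force depends on the MD code's convention), and integrates with the trapezoidal rule. The function names are hypothetical.

```python
import math


def rms_distance(d1, d2):
    """Root mean square of the two forming C-C bond distances (in Å).
    This is one reading of the collective coordinate D described in the text."""
    return math.sqrt(0.5 * (d1 * d1 + d2 * d2))


def free_energy_profile(d_values, mean_gradients):
    """Cumulative trapezoidal integration of the mean free-energy gradient.

    d_values       -- constraint values D in the order they were sampled,
                      e.g. from 3.6 Å (precursors) down to 1.6 Å (product)
    mean_gradients -- assumed input: time-averaged dF/dD at each D
                      (kJ/mol/Å); sign convention must match the MD code
    Returns F(D_i) - F(D_0) for every sampled D_i, in kJ/mol.
    """
    profile = [0.0]
    for i in range(1, len(d_values)):
        width = d_values[i] - d_values[i - 1]  # negative when D decreases
        mean = 0.5 * (mean_gradients[i] + mean_gradients[i - 1])
        profile.append(profile[-1] + mean * width)
    return profile
```

The barrier along the sampled path is then simply `max(profile)`, measured relative to the first window. A finer spacing of D near the transition region reduces the integration error where the mean gradient changes sign.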

Figure 3

Energy profiles for the non-catalyzed and catalyzed formation of NN. Free energies (■) for the non-templated (a) and templated (b) ligation leading to an NN isomer are almost identical, while the MEPs (□) are different and clearly separated. D = 3.6 Å corresponds to precursors, D = 1.6 Å to the product. Errors for free energies are calculated from fluctuations of the constraint force.

AIMD simulations for different reaction paths of the system revealed that the NN template is only able to catalyze its own formation. The NX template, on the other hand, can not only reproduce itself but also catalyze the formation of an NN template. Both autocatalytic reactions feature similar free energy barriers -- 65 and 66 kJ/mol, respectively -- while the cross-catalytic pathway leading to a new NN product templated by NX has a slightly higher barrier of 72 kJ/mol. Calculated free energy profiles ordered by averaged absolute energies of template duplexes are displayed in Fig. 4. An inward-pointing bridge of an NX template destabilizes complexes through repulsive steric interactions, raising NX·NX to an even higher energy than NN·NX because the two bridges are in closer spatial proximity. The same destabilizing effect is present in A·B·NX*, as B is very close to the bridge of NX. This finding, combined with the "selfish" behaviour of the NN isomer, means that NN will outcompete the NX isomer, which remains a side product. The XX isomer is observed as a side product in the recognition-mediated reaction because its formation proceeds in a pseudo-intramolecular fashion and is therefore entropically favoured and accelerated (Fig. 1).

Figure 4

Energy profile for templated reaction pathways. Inward-pointing bridges cause repulsive steric interactions that destabilize the complexes. The same effect also causes a higher energy for a preorganized A·B·NX* complex. The relative order of free energy profiles was calculated from average energies of MD simulations of the termolecular complexes.

Although the composition of the experimental product mixture could be elucidated by 2D-NMR and calculated free energy profiles, an assignment of isomers to 1D-NMR signals was still necessary for kinetic modelling of the system. The assignment of the NN isomer is straightforward, as it is the main product, but it is difficult to distinguish between the NX and XX isomers on the basis of the available data. For a direct assignment of the experimental NMR spectra we calculated thermally averaged ab initio chemical shifts. A comparison of calculated and experimental shifts for the set of non-overlapping protons used to extract time-dependent concentrations (see Fig. 2a/b) shows remarkable agreement for both isomers, with deviations of just 0.05 ppm for XX and 0.03 ppm for NX. Our final assignment was supported by these shifts and corroborated by the fact that the inverse assignment did not allow a good fit of the experimental kinetic data to models consistent with the results of our calculations. A 16:1 diastereoselectivity for NN was determined by integration of the respective NMR peaks. This is a true emergent property, as it results exclusively from the interactions between templates and precursors. It even reverses the slight selectivity for NX in the background reaction.

Our kinetic model was constructed based on information about possible reaction channels from AIMD simulations (see Figs. 1 and 4). Complex equilibria of A·B·NN and A·B·NX complexes were modeled with the same association constant, while A·B·NX* was modeled with a separate association constant to account for different relative complex energies. For the same reason, all three duplex equilibria were assigned different association constants. Different rate constants were assigned to autocatalytic and crosscatalytic ligations. The rate constant for uncatalyzed reactions to NN and NX was known from separate measurements of the background reaction. Complex associations were assumed to be limited only by diffusion. In order to quantify the rate constants for these processes, separate classical MD simulations of A, B and NN in chloroform were performed, and the diffusion constant -- which is proportional to the rate constant in this scenario -- was determined from the center-of-mass mean square displacement via the Einstein relation. Thus we arrived at rate constants of the order of 10^10 M^-1 s^-1 for all diffusion-limited processes. Kinetic data were fitted to the model using Simfit [30] to obtain rate and equilibrium constants. According to the model, the cycloaddition of A+B is rather efficiently catalyzed in the presence of NN or NX, the rate constant kauto being about 50 times larger than knon for a non-catalyzed background reaction (corresponding to an effective kinetic molarity of 50 M). The crosscatalytic mechanism is less efficient: its rate constant kcross is predicted to be approximately one half of kauto, which is in agreement with our calculated free energy profiles. Furthermore, kauto is four orders of magnitude larger than the rate constant kXX of XX formation via an AB channel. This means that this undesirable pathway, although still present, is sufficiently suppressed.
Template duplexes are predicted to be more stable than termolecular complexes, suggesting that the system suffers from product inhibition. Interestingly, association constants for different termolecular complexes and duplexes reflect the relative order of calculated complex energies. All in all, the model is able to describe the dynamic behaviour of the system very well. Nevertheless, one has to keep in mind the system's complexity and the very limited number of accessible observables. As a consequence, kinetic and thermodynamic parameters obtained by kinetic fitting cannot be expected to be highly accurate. On the other hand, our method allowed us to construct a meaningful model in the first place, which would have been impossible without access to free energy profiles of all major reaction paths.
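The route from a mean square displacement to a diffusion-limited rate constant, as used for the association steps of the model above, can be sketched as follows. The Einstein relation in three dimensions, MSD(t) = 6·D·t, is the one named in the text. The conversion of D into a bimolecular rate constant is not specified there; the sketch assumes the Smoluchowski expression k = 4π·N_A·(D_A + D_B)·R, and the encounter radius R as well as all function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def diffusion_coefficient(times, msd):
    """Einstein relation in 3D: D = slope(MSD vs t) / 6.
    The slope comes from an ordinary least-squares line fit, so a constant
    offset in the MSD (short-time ballistic regime) does not bias D.
    times in s, msd in m^2, result in m^2/s."""
    n = len(times)
    mean_t = sum(times) / n
    mean_m = sum(msd) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, msd))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den / 6.0


def smoluchowski_rate(d_a, d_b, encounter_radius):
    """Assumed Smoluchowski estimate of a diffusion-limited rate constant.
    d_a, d_b in m^2/s, encounter_radius in m; returns k in M^-1 s^-1
    (the factor 1000 converts m^3 to litres)."""
    return 4.0 * math.pi * (d_a + d_b) * encounter_radius * AVOGADRO * 1000.0
```

For illustrative inputs of about 2e-9 m^2/s per species and an encounter radius of 0.5 nm, which are typical orders of magnitude for small molecules in a low-viscosity solvent rather than values taken from the simulations, this estimate lands in the 10^10 M^-1 s^-1 range quoted above.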

Complex reaction networks with interesting dynamic signatures in which obstacles like chemical lability or similarity lead to an incomplete base of solid chemical knowledge are expected to challenge chemistry in the future. Our approach of merging experimental NMR kinetics with ab initio dynamical chemical shifts and free energy landscapes enabled us to comprehend a dynamic puzzle which otherwise would have had to remain unsolved.