Introduction

Protonation of nucleic acid bases is known to cause certain alterations in DNA structures, which may affect related biological processes. A well-known example is the formation of the DNA i-motif [1,2,3,4]. The proton-bound cytosine (C) base pair, C:H+∙∙∙C, which occurs in the tetrameric structures of the DNA i-motif, is known to include three hydrogen bonds, one of which is an ionic hydrogen bond [5,6,7]. The triply-bound structure of C:H+∙∙∙C was clearly characterized by infrared multiple photon dissociation (IRMPD) spectroscopy of the proton-bound dimer of 1-methylcytosine (1-MeC) in the gas phase [8]. IRMPD spectroscopy was further used to interrogate the triply-bound base pairing involved in the proton-bound dimers of 1-MeC and its derivatives [9] and those of C and modified Cs [10, 11]. IRMPD spectroscopy combined with ion trap mass spectrometry has offered a powerful means to probe the structures of gaseous proton- or metal ion-bound biomolecules and their complexes [12,13,14,15]; in a combined approach with quantum chemical calculations, it has also successfully elucidated the structures and energetics of proton-bound base pairs of adenine (A), (A:A:H)+ [16], C and guanine (G), (C:G:H)+ [17], and G tetrads [18].

As for the ionic hydrogen bond involved in the triply-bound C:H+∙∙∙C, it was theoretically predicted that the proton transfer in the ionic hydrogen bond can occur at a high rate constant of 6 × 10^11 s^−1 due to the low energy barrier of 1.6 kcal/mol and quantum mechanical tunneling [7], which was later observed by an NMR study (> 10^−8 s^−1) [19]. While protonation of the N3-position of C has been found to participate in many base-pairing cases of protonated C [8,9,10,11, 19] including in DNA i-motif structures [1], proton-bound Hoogsteen base pairing of (C:G:H)+ [17, 20], (C:G:C:H)+ triplet [20] and triple helices [21], protonation of the O2-position of C is also known to occur and cause a mismatched base pairing of A-C in DNA [22, 23].

In a complementary relation to IRMPD spectroscopy, tandem mass spectrometry utilizing collision-induced dissociation (CID) has been an important tool in elucidating binding energetics and related thermochemistry. For example, Cooks’ kinetic method has been useful in determining the thermochemical properties of molecules (B1, B2), when they are involved in a loosely bound ion pair by an ionic hydrogen bond, B1∙∙∙H+∙∙∙B2. The kinetic method suggests that the fragment abundances of [B1:H+] and [B2:H+] can be directly related to the enthalpy of protonation (ΔHp) or the proton affinity (PA) of the moieties in a limiting case in which the entropy effect is essentially negligible [24,25,26].

In an approach known as threshold collision-induced dissociation (TCID) [27], the abundances of fragments, which approximate the reaction rate constants (k), were measured in near-single collision conditions using well-characterized guided-ion beam tandem mass spectrometers [27,28,29,30,31,32]. TCID experiments were applied to investigate various base pairs of protonated Cs, such as methylated and halogenated Cs and deoxycytidines [33,34,35,36,37]; these experiments yielded insight into base-pairing interaction by making accurate determination of binding energetics and PAs. In addition, CID experiments to measure depletion of parent ions in an energy-resolved way (ER-CID), in which parent ions experience multiple collisions in an ion trap, also provided a reliable measure of relative stability of parent ions. ER-CID experiments, often combined with IRMPD spectroscopy, have offered important information including relative stabilities and conformations of biomolecules and cluster ions such as nucleosides and G-tetrads in the gas phase [18, 38, 39].

On the other hand, an important example of DNA structural variations by protonation is proton-bound Hoogsteen base pairing [20, 40], in which protonated C binds to a Watson-Crick (WC) base pair of G-C, giving rise to the formation of a C:H+∙∙∙G-C triplet and triple strands of DNA nucleotide oligomers [21, 41]. While WC base pairing is more stable at physiological conditions, protonated Hoogsteen base pairs can be stabilized and occur under certain circumstances [42,43,44,45]. This formation of triplexes is related to human diseases [46] such as Friedreich’s ataxia [47], and it has also received research attention for its potential applications in therapeutics [48]. As for the motif of proton-bound Hoogsteen base pairing, in the CH+∙∙∙G dimer and the CH+∙∙∙G-C triplet, the proton-bound Hoogsteen base pair was predicted to possess a strong interaction energy of 38.2 kcal/mol, and the energy barrier of proton transfer in the ionic hydrogen bond was also low, at about 4 kcal/mol [20].

In contrast to the triply-bound protonated dimers of C, the experimental efforts to understand the base pairing involved in Hoogsteen base pairs at the molecular level have been rather limited to date. While the isolated protonated complex of C and G, (C:G:H)+, was previously produced by electrospray ionization (ESI) [49], its structures were not identified until very recently. An IRMPD study performed in the spectral range of 900–1900 cm−1 along with differential ion mobility spectroscopy revealed that the isolated (C:G:H)+ complex produced by ESI under acidic conditions (pH = 3.2) dominantly possesses the proton-bound Hoogsteen conformation of C:H+∙∙∙G, with a population higher than 66% [17]. The IRMPD study also found that the preferred conformations of (C:G:H)+ varied depending on the pH of the ESI solutions: at pH = 5.8, ESI preferentially produced the proton-bound WC base pair (91%). The identification using IRMPD at the acidic conditions (pH = 3.2) is in good accordance with a recent quantum chemical study of various conformations of (C:G:H)+ [50]. That theoretical study explored a variety of conformations of proton-bound dimeric structures complexed by multiple hydrogen bonds. As a result, the proton-bound Hoogsteen base pair was proposed as the most stable conformation in the gas phase and also in the aqueous environment; its calculated thermal free energy suggested that it would be predominant in the gas phase (> 90%).

In fact, the recent IRMPD spectroscopy [17] and early CID experiments [49], which have in common the excitation mechanism of slow heating, noted an intriguing feature in CID of a proton-bound Hoogsteen base pair, C:H+∙∙∙G. In view of the kinetic method [24], CID is expected to occur with preferential formation of G:H+ rather than C:H+, as the PA of G is larger than that of C. However, both CID and IRMPD experiments found an anomaly in dissociation, in which the formation of the C:H+ product was more pronounced than that of G:H+. Accordingly, while the recent IRMPD study has begun to shed light on the base pairing in isolated proton-bound Hoogsteen base pairs, the CID behavior still leaves an interesting question to answer. In an effort to understand the intriguing behavior observed in the CID of (C:G:H)+, we combined ER-CID experiments in multiple and near-single collision conditions with theoretical calculations for a homologous series of proton-bound base pairs of C, 1-MeC, and 5-methylcytosine (5-MeC) with G as a common pairing moiety.

Experimental and Theoretical Methods

Materials

Chemicals including cytosine (C), guanine (G), and 5-methylcytosine (5-MeC) were commercially obtained from Sigma-Aldrich Korea (Suwon, Korea). 1-Methylcytosine (1-MeC) was purchased from Synchem UG & Co. KG (Felsberg, Germany). Other chemicals including HPLC-grade acetonitrile and acetic acid were also obtained from Sigma-Aldrich Korea.

ER-CID Experiments in Multiple Collision Conditions

For ER-CID experiments in multiple collision conditions, an ion trap mass spectrometer (LTQ XL, Thermo Scientific, MA, USA) was utilized. The proton-bound heterodimers were produced by electrospray ionization of binary sample mixtures (200 μM each) dissolved in a solution of water/acetonitrile/acetic acid (49:49:2, v/v/v). The solution was directly infused into the mass spectrometer at a flow rate of 5 μl/min, and the temperature of the mass spectrometer inlet capillary was kept at 150 °C. For the CID experiments, the excitation energy was set in terms of the normalized collision energy (N.C.E.), which is the instrument’s own parameter in arbitrary units (a.u.). The scan step was 0.1 or 0.05 depending on the required resolution. The collision gas was He (~ 1 mTorr) and the ions were excited for 30 ms, during which ions experienced multiple collisions with the collision gas in the linear ion trap. The resultant survival yield (S.Y.) for parent dimer ions as a function of N.C.E. was monitored as the ratio of the ion intensity of the surviving parent ion, I(p), to the sum of I(p) and the ion intensities of all fragments, I(f), measured in the CID spectra; S.Y. = I(p) / Itotal = I(p) / (Σ Ii(f) + I(p)). Similarly, the formation yield (F.Y.) of a fragment, f, was defined as F.Y. (f) = I(f) / Itotal = I(f) / (Σ Ii(f) + I(p)). Using the least squares method, the obtained S.Y. data were fitted to the curve S.Y. = [1 + (N.C.E. / CID50%)^n]^−1, which describes the case in which the proton-bound base pairs effectively undergo complete dissociation at high collision energies. The fitted value of CID50% corresponds to the collision energy at which 50% depletion of the parent ion occurs, which represents the relative stability of the proton-bound dimers. The value n represents the steepness of the declining region of the fitted curve [51].
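The fitting step can be sketched in a few lines of Python. This is a minimal illustration, not the fitting procedure used in this study: it linearizes the curve as ln(1/S.Y. − 1) = n·ln(N.C.E.) − n·ln(CID50%) and applies ordinary least squares, whereas the original fit may have been a direct nonlinear one. The function names and the synthetic data (generated with an assumed steepness n = 12) are hypothetical.

```python
import math

def survival_yield(nce, cid50, n):
    """Model curve: S.Y. = 1 / (1 + (N.C.E. / CID50%)^n)."""
    return 1.0 / (1.0 + (nce / cid50) ** n)

def fit_survival_yield(nce_values, sy_values):
    """Estimate (CID50%, n) by a linearized least-squares fit.

    ln(1/S.Y. - 1) = n*ln(N.C.E.) - n*ln(CID50%) is a straight line in
    ln(N.C.E.). Points with S.Y. outside (0, 1) are skipped because the
    transform is undefined there.
    """
    xs, ys = [], []
    for nce, sy in zip(nce_values, sy_values):
        if nce > 0 and 0.0 < sy < 1.0:
            xs.append(math.log(nce))
            ys.append(math.log(1.0 / sy - 1.0))
    if len(xs) < 2:
        raise ValueError("need at least two usable data points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or sxy == 0.0:
        raise ValueError("data do not define a slope")
    slope = sxy / sxx                      # slope equals n
    intercept = mean_y - slope * mean_x    # intercept equals -n*ln(CID50%)
    cid50 = math.exp(-intercept / slope)
    return cid50, slope

# Synthetic demonstration: the curve is generated with CID50% = 10.9
# (the value reported for (C:G:H)+) and an assumed, illustrative n = 12.
nce_grid = [8.0 + 0.5 * i for i in range(11)]
sy_synth = [survival_yield(x, 10.9, 12.0) for x in nce_grid]
cid50_fit, n_fit = fit_survival_yield(nce_grid, sy_synth)
```

On noise-free synthetic data the linearized fit recovers the input parameters; with real data, points near S.Y. ≈ 0 or 1 are weighted heavily by the logarithmic transform, so a nonlinear fit may be preferable.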

ER-CID Experiments in Near-Single Collision Conditions

For near-single collision conditions, the ER-CID experiments were performed on a quadrupole tandem mass spectrometer (Xevo TQ, Waters Corporation, Manchester, UK), which is equipped with a stacked-ring ion-guide collision cell in place of a conventional multipole cell. The stacked-ring ion guide in the cell produces an effectively static pseudopotential that is much steeper at the ring edge when compared to that of common multipole ion guides. It is effectively immune to the loss of parent and fragment ions flying through the collision cell. The advantages of this collision cell coming from static rather than radio-frequency potentials include low electrical noise as well as a large field-free region near the central axis of the guide. In this study, we turned off the DC voltages for the RF and DC barriers, and only the traveling wave was applied to the collision cell [52, 53]. Thus, the scanwave cell served just as a ring-electrode collision cell, i.e., a fragmentation and ion-guide device. For CID, Ar was used as the collision gas; this was the default setting for this instrument. The flow rate of collision gas into the collision cell was carefully controlled by adding an extra high-precision metering valve (Metering HR series, Parker Hannifin, Corp.) so that the pressure inside the collision cell could be adjusted to values as low as 0.06, 0.10, and 0.15 mTorr, which correspond to near-single collision conditions in this instrument. The effects of collision gas pressure were also examined; a small effect, in which the fragmentation yield increased slightly as the pressure increased, could be observed. The CID behaviors for all the proton-bound base pairs were carefully examined at the three pressures, which ensured no significant alteration of the ER-CID data. In addition, fragmentation that occurs without introduction of collision gas was measured and subtracted from the raw CID data for background correction. However, such background fragmentation was negligible in this threshold energy region.

The collision energy in the laboratory frame (Elab) was controlled at an energy step of 0.2 or 0.5 eV. The center-of-mass energy (Ecom) was obtained using the relation Ecom = (Elab + Vc) × mAr / (Mion + mAr), where Mion is the mass of the base pair ion and mAr is that of Ar. However, the setting value of collision energy (Elab) can be influenced by certain factors such as residual fields in the inlet and outlet parts of the collision cell, inaccuracy in applied voltages, contact potentials, and so on, which are difficult to assess when using a commercial instrument. To account for such effects on the collision energy in the laboratory frame, we introduced a single-parameter calibration by employing a potential correction (Vc) in the above relation; this parameter was estimated by comparison with the reported CID results for (1-MeC:C:H)+, (5-MeC:C:H)+ [35], cytidine:H+ [54], and guanosine:H+ [55], which led to an estimate of 2.54 V for Vc in this study. In addition, unlike guided ion-beam instruments specially designed or modified for measurement of threshold CID cross sections [27,28,29,30], instrumental parameters such as the effective length of the collision cell are not known. For this reason, we refrained from assessing collision cross sections from the observed ER-CID data obtained in the near-single collision conditions.
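As a worked illustration of the energy conversion above, the following Python sketch applies Ecom = (Elab + Vc) × mAr / (Mion + mAr) for a singly charged ion. The function name is hypothetical, and the ion mass used in the example (about 263.24 Da for (C:G:H)+, estimated from standard molar masses of cytosine and guanine plus a proton) is an assumption rather than a value quoted in the text; Vc = 2.54 V is the calibration value stated above.

```python
AR_MASS_DA = 39.948   # atomic mass of argon (Da)
V_CORRECTION = 2.54   # potential correction Vc from the text (V)

def center_of_mass_energy(e_lab_ev, ion_mass_da,
                          gas_mass_da=AR_MASS_DA, v_c=V_CORRECTION):
    """Convert a lab-frame collision energy (eV) to the center-of-mass frame.

    Assumes a singly charged ion, so the potential correction in volts
    adds directly to the lab-frame energy in eV.
    """
    return (e_lab_ev + v_c) * gas_mass_da / (ion_mass_da + gas_mass_da)

# Assumed approximate mass of (C:G:H)+: 111.10 + 151.13 + 1.01 Da.
CGH_MASS_DA = 263.24
e_com_example = center_of_mass_energy(10.0, CGH_MASS_DA)
```

Because the ion is much heavier than Ar, only a small fraction (mAr / (Mion + mAr), roughly 13% for this assumed mass) of the corrected lab-frame energy is available in the center-of-mass frame.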

Theoretical Methods

Theoretical calculations were performed using the Gaussian 09 program package [56]. Density functional theory with the B3LYP hybrid functional, which has been reliably employed in many theoretical investigations of nucleic acid bases and base pairs [20, 34,35,36,37], was used in this study. The minimum energy structures were optimized without structural constraints. Vibrational frequency calculations for the optimized structures were carried out to examine whether the structures represented true minima on the potential energy surfaces. The calculations also yielded thermochemical quantities such as enthalpy (H) and Gibbs free energy (G) at 298.15 K. For the energetics, the electronic energy difference values with and without zero-point energy corrections, ΔE0 and ΔEe, respectively, were calculated. As for the interaction energy between the two moieties in a base pair, A+∙∙∙B, the dissociation energy (D0 or De) defined as D = [E(A+) + E(B)] − E(A+∙∙∙B) was evaluated. The complexation energy corrected for basis set superposition errors (DBSSE) using the relaxed counterpoise method was also calculated [57]. Polarizable continuum model (PCM) calculations were performed when the energetics in the aqueous environment were examined [58].
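The dissociation energy definition can be written out as a short Python helper. This is only a sketch of the arithmetic: the function name is hypothetical, the inputs are assumed to be total energies in hartree (as quantum chemistry programs typically report them), and the three energies in the example are made-up placeholders, not values computed in this study.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard unit conversion factor

def dissociation_energy(e_ion_fragment, e_neutral_fragment, e_complex):
    """D = [E(A+) + E(B)] - E(A+...B), inputs in hartree, result in kcal/mol.

    Passing electronic energies gives De; passing zero-point-corrected
    energies gives D0.
    """
    delta_hartree = (e_ion_fragment + e_neutral_fragment) - e_complex
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL

# Placeholder energies (hartree), chosen only to illustrate the arithmetic.
d_example = dissociation_energy(-395.10, -542.70, -937.86)
```

A positive D means the complex lies below the separated fragments, i.e., the base pair is bound.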

To locate transition states (TSs), potential energy scans were performed over large areas of the potential energy surfaces. The TSs were then located using QST3 calculations [59] and further examined by frequency calculations. To accurately calculate energies for monomers, including PA and ΔG, the MP2/aug-cc-pVDT level of theory was also applied; this theory is known to be in good agreement with the CCSD(T) theory [60, 61]. For comparison with the values obtained with the B3LYP density functional, calculations using the MP2/aug-cc-pVDT theory at B3LYP-optimized geometries were also carried out.

Results and Discussion

Anomalous Fragmentation Yields in CID of (C:G:H)+

The kinetic method has offered a simple and accurate way of determining thermochemical quantities by monitoring competitive cleavages of a series of weakly bound complexes that include a common ion [24,25,26]. This method suggests that in a competitive dissociation, the formation rate constants (ki), which are approximated by the respective fragment ion abundances, depend on the enthalpy difference between the two competitive product channels, if the entropic effects for the two fragmentation pathways cancel. This gives the following relation for the dissociation of a proton-bound complex of A∙∙∙H+∙∙∙B, leading to the formation of A:H+ and B:H+ [24]: ln(k1/k2) = ln ([A:H+]/[B:H+]) ≈ Δ(ΔG)/RTeff ≈ Δ(ΔH)/RTeff, where ki accounts for the rate constant for fragmentation channel i, [A:H+] for the abundance of A:H+ fragment ion, Δ(ΔG) and Δ(ΔH) for the differences in Gibbs free energy and enthalpy between the two product states, respectively, and Teff is the effective temperature.

The schematic potential energy diagram for the proton-bound Hoogsteen base pair of C:H+∙∙∙G is given in Figure 1. In the case of C:H+∙∙∙G, the TS is in general considered to be loose and product-like, so the properties of the TS can be deduced from the corresponding product state. The literature values for the entropies of protonation for C and G are small and essentially the same [62]. Thus, it is expected that the branching ratio between the two channels producing G:H+ and C:H+ is largely determined by the PAs of the two moieties, G and C. G is known to possess a higher PA (229.3 kcal/mol) than C (227.0 kcal/mol); these values were experimentally measured by the kinetic method using pyrrolidine [62, 63].
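To make the kinetic-method expectation concrete, the sketch below evaluates [A:H+]/[B:H+] ≈ exp(ΔPA / RTeff) for the literature PA values quoted above. The function name is hypothetical, and the effective temperature of 500 K is an arbitrary value assumed purely for illustration; no Teff is reported in the text.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)

def kinetic_method_ratio(delta_pa_kcal, t_eff_kelvin):
    """Predicted abundance ratio [A:H+]/[B:H+] with delta_pa = PA(A) - PA(B).

    Assumes the entropic contributions of the two channels cancel, as in
    the limiting case of the kinetic method.
    """
    return math.exp(delta_pa_kcal / (R_KCAL_PER_MOL_K * t_eff_kelvin))

PA_G = 229.3  # kcal/mol, literature value quoted in the text
PA_C = 227.0  # kcal/mol, literature value quoted in the text
ASSUMED_T_EFF = 500.0  # K, illustrative assumption only

ratio_g_over_c = kinetic_method_ratio(PA_G - PA_C, ASSUMED_T_EFF)
```

Under these assumptions the ratio is well above 1, i.e., G:H+ would be expected to dominate over C:H+; a different assumed Teff changes the magnitude but not the direction of this prediction.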

Figure 1

Schematic energy diagram for CID of the proton-bound base pair of C and G, C:H+∙∙∙G. The values are ΔEe (kcal/mol) from the most stable, Hoogsteen structure (Table 2)

In this study, the thermochemical quantities for protonation of G and C were calculated at the B3LYP/6-311+G(2d,p) level of theory (Table 1). A larger ΔHp (298.15 K) value of 228.1 kcal/mol was obtained for G than the value of 226.8 kcal/mol obtained for C protonated at the N3-position. The molecular structures of N3-protonated C, C(N3):H+, O2-protonated C, C(O2):H+, and N7-protonated G, G(N7):H+, along with that of the proton-bound Hoogsteen base pair, C:H+∙∙∙G, are given in Figure 1S (Supporting Information). As for ΔGp (298.15 K), values of 228.5 and 226.7 kcal/mol were predicted for G and C, respectively. In addition, the difference between Δ(ΔGp) and Δ(ΔHp), i.e., T∙Δ(ΔS), was found to be small, at only 0.5 kcal/mol. This indicates that Δ(ΔHp) largely accounts for Δ(ΔGp), suggesting that the entropic effect in the product states, and probably that in the product-like TSs, is negligible. In fact, the kinetic method is known to be suitable for accurate determination of PAs of nucleic acid bases and nucleosides [64]. In this regard, the application of the kinetic method is reasonable for predicting the dissociation behavior of the C:H+∙∙∙G base pair.
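The entropic term quoted above follows directly from the four predicted values. The short Python check below reproduces the arithmetic; the variable names are illustrative, and the term is taken, as in the text, simply as the difference between Δ(ΔGp) and Δ(ΔHp).

```python
# Predicted values from the text (kcal/mol, 298.15 K, B3LYP/6-311+G(2d,p)).
DH_P_G, DH_P_C = 228.1, 226.8   # enthalpy of protonation for G and C(N3)
DG_P_G, DG_P_C = 228.5, 226.7   # free energy of protonation for G and C(N3)

dd_h = DH_P_G - DH_P_C          # difference in protonation enthalpy, G minus C
dd_g = DG_P_G - DG_P_C          # difference in protonation free energy, G minus C
t_dd_s = dd_g - dd_h            # entropic term, as the text defines it

summary = {
    "dd_H": round(dd_h, 1),
    "dd_G": round(dd_g, 1),
    "T_dd_S": round(t_dd_s, 1),
}
```

The enthalpy difference accounts for most of the free energy difference, which is the basis for neglecting the entropic term when applying the kinetic method here.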

Table 1 Predicted Thermochemical Properties for Protonation of Monomers (298.15 K); Protonation Was Considered at the N3-Position of C and the N7-Position of G. The Values Obtained for Protonation at the O2-Position of C Are Also Given in Parenthesis (B3LYP/6-311+G(2d,p)) (kcal/mol) (see also Table 1S)

From the kinetic method, it can be expected that G:H+ would be more abundant than C:H+ in fragmentation of C:H+∙∙∙G near the dissociation thresholds due to the larger PA of G; this was, however, not the case according to the previous findings obtained by CID and IRMPD experiments [17, 49]. There may be certain dynamic reasons that cannot be taken into account by simple consideration using the kinetic method. To address this unexpected observation more carefully, we investigated the ER-CID of (C:G:H)+, (1-MeC:G:H)+, and (5-MeC:G:H)+ in both multiple and near-single collision conditions. Methylation of C was exploited to vary the PA of the C moiety systematically, while keeping G as a common counterpart for protonated Cs. The predicted energy order of PA was 1-MeC > 5-MeC > G > C (Table 1), which is in good agreement with the TCID results [33, 35]. (1-MeC:C:H)+ was also examined for comparison, as its structure and threshold dissociation have been well understood from previous IRMPD and TCID studies [10, 35].

Lowest Energy Structures: Proton-Bound Hoogsteen Base Pairs

To gain insight into the structures and energies of the protonated complexes of (C:G:H)+, (1-MeC:G:H)+, and (5-MeC:G:H)+, which were produced by ESI in this study, we carried out a theoretical study at the B3LYP/6-311+G(2d,p) theory level. Figure 2S displays the six lowest energy conformers for (C:G:H)+, which were adopted from the previous study [50], along with the predicted energy differences of ΔE0 and ΔG obtained at the present theory level from the most stable conformer, the proton-bound Hoogsteen base pair (I). Predicted energetics for the lowest energy structures can be found in Table 2S. Briefly, an ionic hydrogen bond is involved in the base pairing of all the lowest energy structures. The Hoogsteen base pair (I) is the most stable structure; the other conformers (II–VI) lie at least 2.5 kcal/mol higher in energy (ΔE0). Conformers II–VI are located very close to one another, within an energy range of 0.5 kcal/mol, and their energy order varies slightly depending on the employed theory levels [50]. Some conformers are related by a proton transfer reaction in the ionic hydrogen bond, such as I and IV as well as II and V. In the Hoogsteen base pair (I), the proton is attached to the N3-position of C rather than to the N7-position of G moiety with a PA that is higher by 1.5 kcal/mol [20]. In II, the structure that is antiparallel to I, the proton is attached to the N3-position of the G moiety rather than the N3-position of C as in V. While the N3-position is the frequent protonation site in the C moiety, O2-protonated C is also found to participate in the base pairing, as in III.

In this study, we considered the structures of the six lowest conformers as favorable ways of base pairing between C- and G-based molecules; we extended this idea to examine the proton-bound base pairs of (1-MeC:G:H)+ and (5-MeC:G:H)+, as they differ only in methylation at C. The calculated results for the conformers of (1-MeC:G:H)+ and (5-MeC:G:H)+ are presented in Figures 3S and 4S, with predicted energetics summarized in Tables 3S and 4S, respectively. As shown in the results for (1-MeC:G:H)+, base pairing of Hoogsteen type represented the most stable conformer (I). The other conformers lie more than 2.8 kcal/mol higher in energy and are very close to one another, within 0.5 kcal/mol. Similarly, the proton-bound Hoogsteen base pair is also the most stable conformer for (5-MeC:G:H)+ and is at least 2.7 kcal/mol more stable than the next most stable conformer (II). Solvent effects considered by PCM calculations suggested that the Hoogsteen base pairs are among the most stable conformers in the aqueous environment. In addition, it was predicted that the protonated WC G-C base pairs were far less stable, by 4.9, 5.9, and 5.5 kcal/mol (ΔE0), than their respective Hoogsteen base pairs of C:H+∙∙∙G (I), 1-MeC:H+∙∙∙G (I), and 5-MeC:H+∙∙∙G (I). Therefore, this theoretical study suggests that the protonated Hoogsteen base pairs may be the most stable and thus the most probable conformers in the gas phase.

The optimized structures of proton-bound Hoogsteen base pairs are given in Figure 2. The dissociation energies were calculated and are given in Table 2. When only dissociation to the lower energy fragments of the two competitive channels is considered, these values are predicted to be close in energy, in the order of (1-MeC:C:H)+ (39.5 kcal/mol in D0) > (5-MeC:G:H)+ (38.9 kcal/mol) > (C:G:H)+ (38.2 kcal/mol) > (1-MeC:G:H)+ (38.0 kcal/mol). While dissociation of (1-MeC:C:H)+, as found in TCID studies, does not give rise to the formation of O2-protonated C, C(O2):H+ [33, 35], the dissociation energies for the fragmentation channels involving formation of C(O2):H+ were also evaluated (Table 2). In fact, O2-protonated molecules of C and nucleosides have been known to exist in the gas phase [65, 66].

Figure 2

The predicted lowest energy structures, proton-bound Hoogsteen base pairs, for (C:G:H)+, (1-MeC:G:H)+, and (5-MeC:G:H)+ as well as (1-MeC:C:H)+. Plus sign denotes the attached proton

Table 2 Predicted Energetics for Dissociation of the Proton-Bound Hoogsteen Base Pairs of (C:G:H)+, (1-MeC:G:H)+, and (5-MeC:G:H)+ (B3LYP/6-311+G(2d,p)) (kcal/mol)

ER-CID Experiments: Anomalous CID Behaviors of Proton-Bound Hoogsteen Base Pairs

ER-CID in multiple collision conditions offers a reliable probe of the relative stability of protonated biomolecules and complexes [18, 38, 39]. In this study, we employed the ER-CID method to examine the proton-bound base pairs produced by ESI. Briefly, proton-bound base pairs were exposed to multiple collisions with He (~ 1 mTorr, 30 ms) in a linear ion trap, and the resulting CID spectra were examined in terms of S.Y. and F.Y. as a function of N.C.E. (a.u.). The CID50% value obtained from the S.Y. curve represents the energy required to deplete 50% of the parent base pairs; CID50% thus serves as a parameter indicating the relative stability of the complexes. The plots of S.Y. of (C:G:H)+, (1-MeC:G:H)+, and (5-MeC:G:H)+ as a function of N.C.E. are given in Figure 3; least squares fittings gave CID50% values of 10.9, 10.6, and 10.3, respectively. These are all lower than the CID50% value of 11.8 obtained for the triply-bound (1-MeC:C:H)+. In fact, the measured CID50% values for the proton-bound base pairs are very close, which suggests that their base-pairing strengths are comparable, perhaps via the same type of base-pairing interaction, i.e., Hoogsteen type, as predicted in the theoretical study (Figure 2). However, the order deviates slightly from the predicted energy order (D0) of (1-MeC:C:H)+ > (5-MeC:G:H)+ > (C:G:H)+ > (1-MeC:G:H)+.

Figure 3

S.Y. data as a function of N.C.E. (a.u.) obtained by ER-CID in multiple collision conditions for (C:G:H)+ (red circle), (1-MeC:G:H)+ (black square), (5-MeC:G:H)+ (blue up-pointing triangle), and (1-MeC:C:H)+ (pink down-pointing triangle)

The ER-CID data were further examined in terms of F.Y. In this CID study of weakly bound non-covalent base pairs, the excitation energy range was quite low, so only the main channel products of C:H+ (or methylated C:H+) and G:H+ appeared in the CID spectra. Figure 4 displays F.Y. as a function of N.C.E. The dissociation of (1-MeC:G:H)+ and (5-MeC:G:H)+ in multiple collision conditions showed predominant generation of 1-MeC:H+ (ΔPA0 = PA0(1-MeC) – PA0(G) = 2.0 kcal/mol > 0) and 5-MeC:H+ (ΔPA0 = 2.1 kcal/mol > 0), which are the moieties with higher PA values than that of the counterpart, G. The observed formation of G:H+ was very weak, at only a few percent of the population. The dissociation of (1-MeC:C:H)+ was similar, and 1-MeC:H+ (ΔPA0 = 3.6 kcal/mol > 0) was pronounced in the CID spectra. Accordingly, these results agree qualitatively with those expected from the kinetic method. However, the CID of (C:G:H)+ is drastically different and indeed anomalous, as noticed in the previous experiments [17, 49]. In the ER-CID of (C:G:H)+, formation of C:H+ (ΔPA0 = − 1.3 kcal/mol < 0) was more pronounced than that of G:H+, despite the smaller PA of the C moiety. The observed ratio of C:H+ to G:H+ was about 5.5:4.5. The experiments were performed using the acidic sample solution (pH = 3.1). However, the anomaly, i.e., the appearance of C:H+ as the most abundant fragment, was still observed even when using the sample solution without the addition of acid (pH = 6.4), which suggests that proton-bound WC base pairs were not significantly produced under our experimental conditions (Figure 5S).

Figure 4

F.Y. data as a function of N.C.E. (a.u.) obtained by ER-CID in multiple collision conditions for (C:G:H)+, (1-MeC:G:H)+, (5-MeC:G:H)+, and (1-MeC:C:H)+

To examine the intriguing CID behavior of (C:G:H)+, CID experiments were further performed in near-single collision conditions using a quadrupole tandem mass spectrometer. The CID data taken at the collision gas (Ar) pressure of 0.10 mTorr are given in Figure 5; the results are essentially the same as those observed in the multiple collision experiments. The results for (1-MeC:C:H)+ accord well with the previous TCID experiments [35], where formation of 1-MeC:H+ (ΔPA0 > 0) was dominant over that of C:H+. Again, dissociation in the near-single collision conditions led to pronounced formation of 1-MeC:H+ and 5-MeC:H+ for (1-MeC:G:H)+ and (5-MeC:G:H)+, respectively, which are the moieties with a PA higher than that of G. As for the CID of (C:G:H)+, the intriguing production of more abundant C:H+ is still evident. Therefore, it can be suggested that there might be certain causes involved in the dissociation process of (C:G:H)+; this will require further understanding of the potential energy surface of (C:G:H)+ along the dissociation coordinates. The stronger formation of C:H+, which was a characteristic of the proton-bound Hoogsteen base pair identified by IRMPD [17], indicates that the (C:G:H)+ produced under acidic conditions in this study is indeed the Hoogsteen complex.

Figure 5

F.Y. data as a function of Ecom (eV) obtained by ER-CID in near-single collision conditions for (C:G:H)+, (1-MeC:G:H)+, (5-MeC:G:H)+, and (1-MeC:C:H)+. The collision gas (Ar) pressure was 0.10 mTorr

Potential Energy Surfaces: Facile Proton Transfer Leading to O-Protonated C Formation

In an effort to understand the dissociation process of the proton-bound Hoogsteen base pair, we theoretically explored the potential energy surface of (C:G:H)+. The resulting potential energy diagrams are given in Figure 6, presented in terms of ΔEe and ΔG (298.15 K) from the lowest energy structure, C:H+∙∙∙G (I). The proton-bound Hoogsteen base pair is related to the IV structure by an intermolecular proton transfer (PT) from the N3-position of C to the N7-position of G with a low PT energy barrier of 4.0 kcal/mol via TSPT(N7) [17, 20, 50]. In fact, this PT offers a pathway relevant to the dissociation into the product state of C + G:H+, in which the proton is transferred to the G fragment. When a potential energy scan was performed by adiabatically elongating the intermolecular distance between the two moieties of the Hoogsteen base pair, the energy gradually increased until the product state of C(N3):H+ + G was reached, and no energy barrier for dissociation was noticeable. These are the two dissociation channels that have been considered as the competitive pathways of dissociation for the proton-bound Hoogsteen base pair, C:H+∙∙∙G → C + G:H+ and C:H+∙∙∙G → C(N3):H+ + G (Figure 1), for which the observed branching ratio has not yet been fully understood [17, 49].

Figure 6

Potential energy diagrams for (C:G:H)+ constructed in terms of ΔEe and ΔG (298.15 K). The energy values (kcal/mol) are calculated at the B3LYP/6-311+G(2d,p) theory level. The MP2/aug-cc-pVDT energies obtained at the B3LYP optimized geometries are also presented in parenthesis

On the other hand, protonation at the N3-position of C is the most important aspect in base pairing, as found in DNA i-motif and Hoogsteen base pairing. However, protonation of the O2-position in isolated C molecules is also known in the gas phase [65, 66], which is energetically even more favorable than that of N3 in C molecule (Table 1). Although the fragmentation to the product state forming C(O2):H+ may thus be possible, it was found that the dissociation to the product state of 1-MeC(O2):H+ + C was not involved in the TCID of (1-MeC:C:H)+. To produce the 1-MeC(O2):H+ fragment, the proton has to migrate from the N3-position to the O2-position of C via the collisional excitation. However, to transfer the proton from the N3 to O2-position of C in the dissociation process, the reaction needs to surmount a PT barrier of 43.4 kcal/mol, which is approximately 3 kcal/mol larger than the threshold dissociation energy to the corresponding product state. Thus, the fragmentation pathway yielding the 1-MeC(O2):H+ fragment was not available in the TCID of (1-MeC:C:H)+ [35].

In the course of exploring the potential energy surface of (C:G:H)+, we also examined the possible PT from the N3 to the O2-position of C in the dissociation of base pairs. Interestingly, in contrast to the (1-MeC:C:H)+ case, it was found there is a PT pathway with a low energy barrier of about 8 kcal/mol (TSPT(O2)), which migrates the proton from the N3-position of C in the Hoogsteen geometry to the O2-position of C in a loosely bound complex formed by a single ionic hydrogen bond (INT), through which further potential energy scan smoothly dissociated the complex to the product state of C(O2):H+ + G. When compared with the dissociation energy threshold of 40.3 kcal/mol, the TSPT(O2) is very low (8 kcal/mol). This is easily surmountable in the energy regime in which actual collision-induced dissociation can take place. In Figure 7, the potential energy diagrams for (1-MeC:G:H)+ and (5-MeC:G:H)+ are also displayed. The theoretical study also indicated facile PTs upon activation of the base pairs of (1-MeC:G:H)+ and (5-MeC:G:H)+, of which the energetics are similar to that of (C:G:H)+.

Figure 7

Potential energy diagrams for (1-MeC:G:H)+ and (5-MeC:G:H)+ constructed in terms of ΔEe and ΔG (298.15 K). The energy values (kcal/mol) are calculated at the B3LYP/6-311+G(2d,p) level of theory. The MP2/aug-cc-pVDT energies obtained at the B3LYP-optimized geometries are also presented in parentheses.

The presence of an additional fragmentation pathway producing O2-protonated C, C(O2):H+, answers several important questions in this study. First, although mass spectrometry measures only ion masses and therefore cannot distinguish the protonation site of the observed protonated C fragments, C:H+, it is most likely that both N3- and O2-protonated Cs are produced and coexist in the observed C:H+ fragments. Second, the availability of a new product channel to O2-protonated C accounts for the deviation of the order of relative stability (CID50%), (1-MeC:C:H)+ > (C:G:H)+ > (1-MeC:G:H)+ > (5-MeC:G:H)+, from that of the predicted D0. With the additional accessible channel to O2-protonated C, the energy order of the lowest product channels is altered to (1-MeC:C:H)+ (39.5 kcal/mol) > (C:G:H)+ (38.2 kcal/mol) > (1-MeC:G:H)+ (38.0 kcal/mol) > (5-MeC:G:H)+ (37.8 kcal/mol), which now agrees well with the CID50% results. Together with the theoretical prediction of the lowest energy structures (Figure 2), the agreement of the relative stability (CID50%) with the predicted energetics (ΔD0) supports the idea that the protonated base pairs produced by ESI in this study are indeed proton-bound Hoogsteen base pairs of C:H+∙∙∙G, 1-MeC:H+∙∙∙G, and 5-MeC:H+∙∙∙G.
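The agreement between the two orderings can be verified directly from the values quoted in this paragraph. The snippet below is a minimal sketch; the variable names are hypothetical, and it uses only the four lowest-channel energies and the CID50% order stated in the text.

```python
# Lowest product-channel energies (kcal/mol) once the O2-protonated C
# channel is included, as quoted in the text.
lowest_channel = {
    "(1-MeC:C:H)+": 39.5,
    "(C:G:H)+": 38.2,
    "(1-MeC:G:H)+": 38.0,
    "(5-MeC:G:H)+": 37.8,
}

# Experimental order of relative stability (CID50%), most stable first.
cid50_order = ["(1-MeC:C:H)+", "(C:G:H)+", "(1-MeC:G:H)+", "(5-MeC:G:H)+"]

# Rank the complexes by predicted energy, highest (most stable) first.
predicted_order = sorted(lowest_channel, key=lowest_channel.get, reverse=True)

orders_agree = predicted_order == cid50_order
print("Predicted order:", " > ".join(predicted_order))
print("Matches CID50% order:", orders_agree)
```

Note that the three Hoogsteen pairs span only 0.4 kcal/mol, so the ranking among them rests on small energy differences.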

Finally, the anomaly in the CID of (C:G:H)+ was that the observed fragment abundances contradicted the prediction made by considering only two competitive channels. The additional reaction channel found here, however, effectively doubles the number of available states in the TS leading to C:H+, which greatly promotes the apparent rate of production of C:H+ fragments. This explains the observed anomaly in the dissociation of (C:G:H)+, as well as the pronounced production of protonated C fragments in the CID of (1-MeC:G:H)+ and (5-MeC:G:H)+.

As for the ER-CID of (1-MeC:G:H)+ and (5-MeC:G:H)+, formation of (1-MeC:H)+ and (5-MeC:H)+ as the most abundant fragments does not appear anomalous, as the PAs of 1-MeC and 5-MeC are larger than that of the counterpart, G (ΔPA > 0). However, the measured branching ratios of [1-MeC:H+]/[G:H+] and [5-MeC:H+]/[G:H+] are very large, 149 and 51, respectively (Figure 4). If it is assumed that the three competitive dissociation pathways leading to formation of N-protonated C (k1), O-protonated C (k1’), and protonated G (k2) are completely independent, the basic relation of the kinetic method can be extended as ln((k1 + k1’)/k2) = ln([x-MeC:H+]/[G:H+]) ≈ (ΔGk1 + ΔGk1’ – ΔGk2)/RTeff. Assuming that Teff is 200 K, the branching ratios estimated using the calculated ΔGp values (Table 1S) are also very large, 131 and 78 for [1-MeC:H+]/[G:H+] and [5-MeC:H+]/[G:H+], respectively. This explains the large observed branching ratios and further reveals the same anomalous CID behavior for (1-MeC:G:H)+ and (5-MeC:G:H)+ as found for the proton-bound Hoogsteen base pair of (C:G:H)+.
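To give a feel for the scale of this relation, the sketch below inverts it: for each quoted branching ratio it computes the size of the combined free-energy term on the right-hand side that would correspond to that ratio at Teff = 200 K. The function names are hypothetical, the combined term is treated as a single number, and the actual ΔG values from Table 1S are not reproduced here.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)


def branching_ratio(delta_g_term, t_eff):
    """Ratio from ln(ratio) = delta_g_term / (R * T_eff).

    delta_g_term stands for the whole numerator of the extended
    kinetic-method relation, in kcal/mol (an assumption of this sketch).
    """
    return math.exp(delta_g_term / (R_KCAL * t_eff))


def implied_delta_g_term(ratio, t_eff):
    """Invert the relation: free-energy term (kcal/mol) implied by a ratio."""
    return R_KCAL * t_eff * math.log(ratio)


T_EFF = 200.0  # effective temperature in K, as assumed in the text

quoted_ratios = {
    "[1-MeC:H+]/[G:H+] measured": 149,
    "[1-MeC:H+]/[G:H+] estimated": 131,
    "[5-MeC:H+]/[G:H+] measured": 51,
    "[5-MeC:H+]/[G:H+] estimated": 78,
}

implied = {k: implied_delta_g_term(v, T_EFF) for k, v in quoted_ratios.items()}
for label, value in implied.items():
    print(f"{label}: implied free-energy term = {value:.2f} kcal/mol")
```

At 200 K, RTeff is only about 0.4 kcal/mol, so all four ratios correspond to a combined term of roughly 1.6 to 2.0 kcal/mol. The estimate is therefore quite sensitive both to the assumed Teff and to small errors in the calculated free energies.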

Conclusion

In this study, the ER-CID of (C:G:H)+ exhibited an anomaly in fragment abundances between the two competitive fragmentation pathways generating C:H+ and G:H+ fragments. In the ER-CID of (1-MeC:G:H)+ and (5-MeC:G:H)+ (ΔPA > 0), the protonated fragments of the methylated Cs were more abundant than the common ion, G:H+, as expected. In the ER-CID of (C:G:H)+, however, protonated C was still the more pronounced fragment despite the lower PA of C (ΔPA < 0). A further theoretical study of the potential energy surfaces of the proton-bound Hoogsteen base pairs showed that, in contrast to the triply-bound complex of (1-MeC:C:H)+, proton transfers upon collisional excitation are feasible not only from the N3-position of C to the N7-position of G (intermolecular PT barrier of 4 kcal/mol) but also from the N3-position to the O2-position of C (intramolecular PT barrier of 8 kcal/mol). This suggests that the proton in the ionic hydrogen bond is rather mobile in the excited complexes, leading to formation of both N3- and O2-protonated Cs. The availability of an extra channel for C:H+, which implies a twofold degeneracy or an increased number of available states in the TS, may give rise to the pronounced formation of the C:H+ product. It can therefore be considered that it is the proton transfer that accounts for the apparent anomaly observed in the CID of the proton-bound Hoogsteen base pair of C:H+∙∙∙G.