Introduction

It is well recognized that prototropy in nucleobases influences the structure of nucleic acids and their replication, mutation, and degradation processes [1–13]. Labile hydrogens can move as protons between conjugated sites and change the properties of nucleobases, particularly their ability to form H-bonds. This phenomenon and the hypothesis of rare tautomers, suggested by Watson and Crick [2] in 1953 for DNA mutations, later developed by Löwdin [3, 4] and advanced by Topal and Fresco [5], encouraged many chemists to undertake theoretical and experimental studies on the structure of nucleobases in different environments. After six decades of research, although hundreds of articles on the tautomerism of nucleobases can be found in the literature, this phenomenon continues to attract the attention of scientists.

In normal DNA, cytosine (C) forms an H-bonded pair with guanine (G), and adenine (A) forms the other one with thymine (T) [1]. When the tautomeric equilibrium for one base is disturbed, the pairing may be mismatched, and mutations of a single nucleotide (point mutations) may appear [2, 9–13]. When a point mutation in some genes is followed by changes in the sequence of amino acids, the changes may lead to serious diseases, e.g., cystic fibrosis, mucosal melanoma, colorectal cancer, lung cancer, and pancreatic cancer. Fortunately, the DNA mismatch repair system recognizes and repairs errors that appear during DNA replication, and only in exceptional cases does this system fail to detect errors, so that a serious disease appears [14]. The reasons for this failure are not yet well understood, and the hypothesis of rare tautomers is continually being verified by experimental and theoretical methods [9–13, 15–20].

One should mention here that experimental investigations of tautomeric conversions require exceptional efforts, because prototropy is a very fast and reversible process [6, 7, 21]. Prototropy is also very sensitive to experimental conditions. Individual tautomers are very difficult to separate and to study. Very frequently, experimental techniques cannot give complete information on all possible tautomeric forms and on all possible tautomeric equilibria. The principal reason is as follows. An experiment makes it possible to identify major tautomers, the signals of which have significant intensities. In some cases, minor tautomers can also be detected. Rare tautomers are usually undetectable, probably because their amounts are too small and their signals cannot be distinguished from the background. Moreover, the experimental techniques, such as ultraviolet (UV), infrared (IR), Raman, microwave (MW), nuclear magnetic resonance (NMR), and mass spectrometry (MS), have their own limits of detection, and thus, the number of detected tautomers may be different for different methods. It may also be smaller than the number of possible tautomeric forms. Quantum chemical methods have the advantage that they make it possible to study all possible individual tautomers and all intramolecular and intermolecular interactions [6, 7, 21–23]. One can also model all possible tautomeric conversions and predict microscopic parameters for isolated, microsolvated, and macrosolvated systems. The effects of different factors that influence the stabilities of individual tautomers and their associates can also be examined. Such investigations help to understand the properties of tautomeric systems and the mechanisms of important chemical and biochemical transformations.

For our quantum chemical studies on the favored and rare tautomers of nucleic bases, we chose cytosine (Scheme 1). The reasons are as follows. Its rare tautomer may cause mutations [2]. It may form an H-bonded pair with adenine. Consequently, cytosine may be replaced by thymine during replication. The GC → AT transition seems to be the most frequent DNA mutation [2, 10–13, 24, 25]. Hence, it is very important to study the favored and rare tautomers of cytosine in various environments.

Scheme 1

Canonical form of cytosine (C) and the structures of its convenient models, 4-amino- (4APM) and 2-hydroxypyrimidine (2OHPM). Labile protons marked in red and conjugated sites in blue (Color figure online)

Cytosine contains two labile protons and five conjugated tautomeric sites. The two protons can move according to 1,3, 1,5 and/or 1,7 proton shifts between the O7, N8, N1, N3, and/or C5 atoms. The proton transfers in tautomeric conversions are accompanied by migrations of one, two, and/or three double bonds, respectively, and no separation of charge takes place [6, 21, 26]. Some proton transfers for cytosine are analogous to those for its structural models, 4-aminopyrimidine (4APM) and 2-pyrimidone, a tautomeric form of 2-hydroxypyrimidine (2OHPM). Combinations of various types of tautomeric conversions, such as amide-iminol, amine-imine, and enamine-imine, lead to the complete tautomeric mixture for cytosine consisting of nine tautomers (Table 1). It should be noted here that the number of tautomers is a property of the tautomeric system. It depends on the number of labile protons and on the number of conjugated tautomeric sites [26]. However, the relative stabilities of individual tautomers depend on various internal and external factors, which strongly influence tautomeric preferences [6, 7, 21]. For cytosine, the C2 (N1N8) isomer was called the ‘canonical tautomer’ [1]. This tautomeric form is the one most frequently present in nucleic acids [1, 8, 24, 25]. The C8 (N1N3) isomer, probably responsible for the point mutations of DNA, was called the rare tautomer [2].

Table 1 Positions of labile protons for tautomers of cytosine (C), 4-amino- (4APM), and 2-hydroxypyrimidine (2OHPM)

Numerous experimental and theoretical reports can be found in the literature on prototropic conversions for neutral cytosine in the gas phase. There are also some reports on microsolvated and macrosolvated cytosine. In the solid state, the canonical form C2 has been found for cytosine [27]. In the gas phase, depending on the experimental method applied (matrix isolation IR, MW, REMPI, IR laser in helium nanodroplets, MS, core-level X-ray photoemission, and near-edge X-ray absorption), two (C1 and C2) or three (C1, C2, and C8) tautomers have been detected for cytosine [28–35]. Recently, five isomers of gaseous cytosine (two rotational isomers a and b of C1, one isomer of C2, and two geometrical isomers a and b of C8) have been characterized by Alonso et al. [36], who applied laser-ablation molecular-beam Fourier-transform microwave (LA-MB-FT-MW) spectroscopy. In aqueous solution, two tautomeric forms of cytosine (C2 and C3) seem to dominate [37].

Unfortunately, the complete tautomeric mixture of cytosine has not been investigated by quantum chemical methods [38–50]. At most six tautomers have been studied for neutral cytosine (C1–C3, C5, C6, and C8), and the amide-iminol and amine-imine conversions have been analyzed [40–42]. In some papers, even rotational isomerism of the exo –OH group and geometrical isomerism of the exo =NH group have not been considered. The enamine-imine conversions and the CH tautomers (C4, C7, and C9) have usually been neglected. Favored and rare tautomers have been studied only for adiabatically bound valence anions of cytosine, for which the importance of the C9 tautomer was discussed [51]. The CH tautomer has also been found to be favored for negatively ionized 4APM [52, 53].

The hypothesis of rare tautomers and also the variability of tautomeric preferences for the adiabatically bound valence anions of cytosine and for the ionized forms of 4APM encouraged us to undertake studies of the complete tautomeric mixture of cytosine at various oxidation states: the neutral (C), oxidized (C − e → C +·), and reduced (C + e → C −·) states in the gas phase. In the literature, one-electron oxidation and one-electron reduction are also called positive and negative ionization in mass spectrometry, or electron detachment and electron attachment in photoelectron spectroscopy. In this work, tautomeric conversions and various internal effects such as substituent effects and intramolecular interactions between neighboring groups are discussed for the neutral and redox forms of cytosine. Geometric and energetic consequences of prototropy are also examined and compared to those observed earlier for 4APM and 2OHPM. For these investigations, the DFT method [54] has been employed with the hybrid functional of Becke [55] and the gradient correction of Lee et al. (B3LYP) [56] and the 6-311+G(d,p) basis set [57], as previously described for adenine and its models [52, 53, 58, 59]. The B3LYP functional has been recommended for charged radicals [60] and applied to the anionic states of nucleic bases [51]. It has also been used for geometry optimization in the G3B3 theory [61].

Methods

Geometries of the neutral and charged forms of all possible cytosine isomers in their ground states (Fig. 1) were fully optimized in the gas phase without symmetry constraints employing the DFT(B3LYP) method [54–56] and the 6-311+G(d,p) basis set [57]. The restricted B3LYP functional was used for neutral isomers, and the unrestricted B3LYP functional was applied for charged radicals. For all structures, frequencies were calculated, first to prove that the structures are minima and next to estimate the corresponding zero-point energies. Thermodynamic parameters such as the energy (E), enthalpy (H = E + pV), entropy (S), and Gibbs energy (G = H − TS for T = 298.15 K) were calculated using the same level of theory. For tautomeric conversions, the relative thermodynamic parameters (ΔE, ΔH, TΔS, and ΔG), the tautomeric equilibrium constants {as pK = ΔG/(2.303RT)}, and the percentage contents of individual forms {x = K/(1 + K)} were estimated. The ΔG values include changes in the electronic energy, zero-point energy (ZPE), and thermal corrections to the energy and entropy (vibrational, rotational, and translational). The theoretical adiabatic ionization potential {IP = E(optimized radical cation) − E(optimized neutral)} and the theoretical adiabatic electron affinity {EA = E(optimized neutral) − E(optimized radical anion)} were calculated for the tautomeric mixture, taking the total energies of the neutral and charged forms at their respective equilibrium nuclear configurations and also considering their percentage contents in the tautomeric mixture. All calculations were performed according to the procedures included in the Gaussian-03 series of programs [62].
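The relations pK = ΔG/(2.303RT) and x = K/(1 + K) given above can be illustrated with a minimal Python sketch. It assumes that K is defined as the product-to-reactant ratio, so that K = 10^(−pK), and it extends the two-component formula to a multi-component mixture through Boltzmann weights, which is an assumption of this sketch rather than a procedure stated in the text. The function names and the ΔG values in the example are hypothetical and are not taken from Fig. 1.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal mol^-1 K^-1
T_STD = 298.15         # temperature used in the text, K


def pk_from_dg(dg_kcal, temperature=T_STD):
    """pK = dG / (2.303 R T) for one tautomeric conversion."""
    return dg_kcal / (math.log(10.0) * R_KCAL * temperature)


def pair_fraction(dg_kcal, temperature=T_STD):
    """Fraction x = K / (1 + K) of the product form, assuming K = 10**(-pK)."""
    k = 10.0 ** (-pk_from_dg(dg_kcal, temperature))
    return k / (1.0 + k)


def mixture_populations(rel_dg, temperature=T_STD):
    """Boltzmann populations of several isomers from relative dG values."""
    weights = {name: math.exp(-dg / (R_KCAL * temperature))
               for name, dg in rel_dg.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical relative Gibbs energies (kcal/mol), NOT values from Fig. 1.
example = {"form_A": 0.0, "form_B": 1.0, "form_C": 3.0}
for name, x in mixture_populations(example).items():
    print(f"{name}: {100.0 * x:.1f}%")
```

Because the populations depend exponentially on ΔG, an isomer lying more than a few kcal mol−1 above the favored form contributes a very small percentage at 298.15 K.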

Fig. 1

All possible isomers of cytosine. The relative Gibbs energies (in kcal mol−1 at 298.15 K) and the HOMED indices estimated for neutral (red), oxidized (blue), and reduced (rose) cytosine (nf—structure not found at the DFT level). ΔG given in parentheses. HOMED6 and HOMED8 placed inside and outside of the ring, respectively (Color figure online)

To determine properly the distribution of π- and n-electrons for all tautomers/rotamers of cytosine and to describe well the variations of electron delocalization, the geometry-based HOMED (harmonic oscillator model of electron delocalization) procedure [63, 64] was applied to the geometries optimized at the DFT(B3LYP)/6-311+G(d,p) level. The abbreviation HOMED was proposed in 2006 for the modified index [63], but it may also be abbreviated as moHOMA (modified original HOMA) or simply HOMA. The HOMA (harmonic oscillator model of aromaticity) index [65, 66], reformulated by Krygowski [67], and the HOMHED (harmonic oscillator model of heterocyclic electron delocalization) index, proposed by Frizzo and Martins in 2012 [68] and based on hypotheses of the HOMED index [64], were not applied here for cytosine. The reasons were discussed previously [53, 69]. Since the same resonance phenomenon takes place for neutral molecules, ions, and radicals [26], the HOMED indices were estimated for neutral and redox forms of cytosine using the following equation: $\mathrm{HOMED} = 1 - \{\alpha(\mathrm{CC}) \times \sum [R_o(\mathrm{CC}) - R_i(\mathrm{CC})]^2 + \alpha(\mathrm{CX}) \times \sum [R_o(\mathrm{CX}) - R_i(\mathrm{CX})]^2\}/n$. In this equation, $\alpha$ are the normalization constants, $R_o$ are the optimum bond lengths (assumed to be realized for fully delocalized systems), $R_i$ are the running bond lengths in the tautomeric system, and $n$ is the number of bonds taken into account. In the case of cytosine isomers, six bonds for the pyrimidine ring and eight bonds for the whole molecule, including the exo –OH/=O and –NH2/=NH groups, were taken into account. The normalization $\alpha$ constants for the even number of bonds were calculated from the following equation: $\alpha = 2 \times [(R_o - R_s)^2 + (R_o - R_d)^2]^{-1}$, where $R_s$ and $R_d$ are the reference single and double bonds, respectively. 
The following $R_s$, $R_d$, and $R_o$ values (in Å), calculated at the B3LYP/6-311+G(d,p) level for the reference molecules, were taken here [64]: 1.530 (ethane), 1.329 (ethene), and 1.394 (benzene) for the CC bonds; 1.466 (methylamine), 1.267 (methylimine), and 1.334 (1,3,5-triazine) for the CN bonds; and 1.424 (methanol), 1.202 (formaldehyde), and 1.281 (protonated carbonic acid) for the CO bonds. On the basis of these R values, normalization α constants equal to 88.09, 91.60, and 75.0 were used for the CC, CN, and CO bonds, respectively [64].
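The HOMED definition and the normalization constants described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the input format (a list of bond-type/length pairs) are hypothetical, and α is recomputed from the three-decimal reference bond lengths quoted in the text, so it reproduces the published constants (88.09, 91.60, and 75.0) only approximately. The bond lengths in the examples are idealized test cases, not optimized cytosine geometries.

```python
# Reference single, double, and optimum bond lengths (in angstroms)
# as quoted in the text for the B3LYP/6-311+G(d,p) level.
REFERENCE = {
    "CC": (1.530, 1.329, 1.394),
    "CN": (1.466, 1.267, 1.334),
    "CO": (1.424, 1.202, 1.281),
}


def alpha(r_single, r_double, r_opt):
    """Normalization constant for an even number of bonds:
    alpha = 2 / [(Ro - Rs)^2 + (Ro - Rd)^2]."""
    return 2.0 / ((r_opt - r_single) ** 2 + (r_opt - r_double) ** 2)


def homed(bonds):
    """HOMED = 1 - sum(alpha_X * (Ro_X - Ri)^2) / n.

    `bonds` is a list of (bond_type, length) pairs, e.g. ("CN", 1.35).
    Assumption of this sketch: the even-bond alpha formula is applied
    to every bond type, whatever the number of bonds of that type.
    """
    total = 0.0
    for kind, length in bonds:
        r_single, r_double, r_opt = REFERENCE[kind]
        total += alpha(r_single, r_double, r_opt) * (r_opt - length) ** 2
    return 1.0 - total / len(bonds)


# Idealized test cases (hypothetical geometries):
delocalized_ring = [("CC", 1.394)] * 6                 # all bonds optimal
localized_ring = [("CC", 1.530), ("CC", 1.329)] * 3    # alternating bonds

print("alpha(CC) =", round(alpha(*REFERENCE["CC"]), 2))
print("HOMED, delocalized ring =", homed(delocalized_ring))
print("HOMED, localized ring   =", round(homed(localized_ring), 6))
```

By construction, the index equals unity when every bond has its optimum length and falls to zero for a ring of strictly alternating reference single and double bonds, which is the scale on which the HOMED6 and HOMED8 values are read.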

Results and discussion

For the hydroxy forms of cytosine (C1 and C5–C7), two rotational isomers can be considered, one with the hydroxy H atom synperiplanar to the endo N1 atom (a) and the other one with this atom synperiplanar to the endo N3 atom (b). Due to geometrical isomerism of the exo =NH group, two isomers are also possible for the imino forms of cytosine (C5–C9), one with the imino H atom synperiplanar to the endo N3 atom (a) and the other one with this atom synperiplanar to the endo C5 atom (b). Taking into account all types of isomerism possible for cytosine (tautomerism, rotational, and geometrical isomerism), the structures of twenty-one neutral isomers were optimized at the DFT(B3LYP)/6-311+G** level (Fig. 1). The same number of isomers was considered for the charged radicals, and their structures were optimized at the DFT level. For the thermodynamically stable structures, the relative Gibbs energies (ΔG) were calculated, and the HOMED indices were estimated for the ring (six bonds, HOMED6) and for the whole tautomeric system, including the exo –NH2/=NH and –OH/=O groups (eight bonds, HOMED8). All calculated ΔG and HOMED values are given in Fig. 1.

Minima with real frequencies were found for all twenty-one possible isomers of neutral cytosine. For the amino tautomers, the exo NH2 group is not in the ring plane [43]. It has a pyramidal conformation. For the imino tautomers, transfer of labile proton(s) to the endo N atom(s) in C2, C3, C5, C6, and C8 does not destroy the planarity of the ring. Only the CH amino (C4) and imino tautomers (C7 and C9) lose the planarity of the ring due to the presence of the C5-sp3 atom. For positively charged forms (radical cations), three isomers, C7ab +·, C7bb +·, and C9a +·, were not found. The exo =NH group in the thermodynamically stable C7aa +·, C7ba +·, and C9b +· isomers is out of the ring plane by dihedral angles of 34, 37, and 62°, respectively. The exo =NH group is also twisted for C5ab +· and C5bb +· by 51 and 24°, respectively. The exo NH2 group is planar for all positively charged amino forms. For radical anions, this group takes the pyramidal conformation. Only one isomer, C6bb −·, was not found for negatively charged cytosine. The exo =NH group in all stable negatively charged isomers is almost in the ring plane. The dihedral angle is not larger than 5°.

Positions of labile protons and oxidation states strongly influence the CC, CN, and CO bond lengths for the cytosine isomers. The effects are not parallel for neutral and redox forms. This suggests that there is no mechanism of one-electron oxidation, nor of one-electron reduction, that is common to all cytosine isomers. An analysis of the total atomic spin densities confirms the differences. Various exo and endo heteroatoms and/or π-bonds lose one electron in cytosine tautomers. One electron is also gained by different sites. The differences have already been observed for the hydroxy and amino derivatives of azines [53]. Consequently, the HOMED indices estimated for the six-membered ring (HOMED6) as well as for the whole molecule including the exo groups (HOMED8) vary in a different way for the neutral and ionized amino-hydroxy, amino-oxo, imino-hydroxy, and imino-oxo isomers, and there is no linear relation between the HOMED indices of the neutral and ionized forms (Fig. 2).

Fig. 2

Plots between the HOMED8 indices of neutral and ionized forms of cytosine

The HOMED8 values for neutral isomers of cytosine vary from 0.39 to 0.94. The C1a and C1b isomers have the largest HOMED values (close to unity), and they are aromatic. The C4 isomer has the lowest HOMED value, and it is non-aromatic due to the presence of the C5-sp3 atom in the ring. The HOMED indices of the other CH isomers with the C5-sp3 atom (C7aa, C7ab, C7ba, C7bb, C9a, and C9b) are only slightly larger (HOMED8 0.40–0.47) than that of C4. The canonical tautomer C2 is less delocalized (HOMED8 0.79) than the C1a and C1b isomers, but it is still aromatic. Similar π-electron delocalization has been previously reported for the canonical tautomer using various measures of aromaticity [70–73]. The N1 atom in C2 taking the labile proton retains its planarity. The nπ conjugation in the six-membered ring of C2 is similar to that for the five-membered ring in pyrrole, imidazole, etc. [64].

One-electron oxidation dramatically changes bond lengths and electron delocalization for all cytosine isomers. Consequently, the geometry-based indices (HOMED6 and HOMED8) also change. They decrease for the CH–NH isomers (C4 +· and C9b +·) even to zero, while they increase for the CH–OH ones (C7aa +· and C7ba +·) to 0.73–0.74. They also increase for the imine NH–OH isomers (C5aa +·, C5ab +·, C5ba +·, C5bb +·, C6aa +·, C6ab +·, C6ba +·, and C6bb +·, HOMED8 ≥ 0.89) and for the canonical NH–NH form C2 +· (HOMED8 0.87). However, they slightly decrease for the NH–OH isomers C1a +· and C1b +· (HOMED8 0.92 and 0.93, respectively). One-electron reduction also causes different effects. For negatively ionized cytosine isomers, the HOMED8 values vary only from 0.49 to 0.80. For the canonical form C2 −·, the HOMED8 value slightly decreases (to 0.70) when compared to the neutral form. The HOMED8 values decrease to a greater degree for the NH–OH isomers C1a −· and C1b −· (0.66). Interestingly, the HOMED8 values increase for the CH tautomers (0.49–0.55). Similar tendencies for the HOMED indices have been observed for model compounds [52, 53].

The relative entropy term (TΔS) values are not very large (±1 kcal mol−1) for neutral and charged isomers of cytosine, similar to other tautomeric systems [21, 52, 53, 58, 59]. This indicates that all tautomeric conversions for cytosine (Scheme S1, Supplementary Material) are isoentropic processes in the gas phase. There are no large structural changes during tautomerization. Some exceptions are those for the amine and imine CH tautomers. They are a consequence of the ring planarity loss. The relative thermal corrections for all tautomeric conversions are also close to zero, and ΔE ≅ ΔH ≅ ΔG. These observations suggest that the relative thermodynamic parameters depend very little on temperature. When proceeding from 0 to 298.15 K, orders of the relative energies for major isomers do not change for neutral and charged cytosine. Generally, proton transfer reactions are isoentropic for almost all organic acids and bases [74, 75]. The relative entropies and the relative thermal corrections are close to zero.

First, perusal of the relative Gibbs energies calculated for neutral isomers of cytosine makes it possible to distinguish the following characteristic properties for major, minor, and rare tautomers. All CH isomers (C4, C7, and C9) can be considered as very rare forms for neutral cytosine (ΔG ≫ 10 kcal mol−1). Indeed, the proton transfers from the N1, N3, O7, or N8 atom to the C5 atom are very unfavorable processes, and all CH isomers can be neglected for neutral cytosine. Analogous conclusions have been derived for the model compounds, 4APM [52] and 2OHPM [53], using the same level of theory. The CH tautomers (4APM4 and 2OHPM4) can be neglected for neutral 4APM and 2OHPM (ΔG ≫ 10 kcal mol−1). The imino OH–NH forms (ΔG ≫ 10 kcal mol−1) containing one labile proton at the O7 atom and the other one at the N1 (C5) or N3 atom (C6) can also be considered as very rare forms, and thus, they can also be neglected for neutral cytosine. They are analogous to the imino NH forms of 4APM (4APM2 and 4APM3) [52]. One exception is the imino NH–NH form with labile protons at the N1 and N3 atoms (C8), typical for the pyrimidine bases (uracil, thymine, and cytosine). Its two rotamers a and b (ΔG 2–4 kcal mol−1) can be considered as minor forms rather than rare isomers for neutral cytosine (Scheme 2). The amino C2 and C3 (NH–NH) tautomers with one labile proton at the N1 and N3 atoms, respectively, are analogous to 2OHPM2 and 2OHPM3. For the model compound, 2OHPM2 and 2OHPM3 are identical. They have identical thermodynamic stabilities and the same energies [53]. However, the presence of the exo NH2 group at the 4-position in cytosine strongly differentiates the relative stabilities of C2 and C3. The C3 tautomer can be considered as a rare amino form (ΔG 7 kcal mol−1), whereas the C2 tautomer is the favored ‘canonical’ form for neutral cytosine at the B3LYP level (ΔG 0 kcal mol−1). 
The two rotamers a and b of the amino C1 (NH–OH) tautomer with the labile proton at the O7 atom are major amino forms (ΔG ≤ 2 kcal mol−1).

Scheme 2

Favored redox processes for cytosine estimated at the DFT level

The presence of all five major and minor isomers (C1a, C1b, C2, C8a, and C8b) in the tautomeric mixture of neutral cytosine has been experimentally proven [36]. Our DFT calculations are in good agreement with those reported in the literature [39–42, 47, 49]. However, it should be mentioned here that the percentage contents of major and minor forms depend on the level of calculations [38–50] as well as on the experimental method applied [30–36]. For example, the DFT methods predict the canonical tautomer C2 as the favored form for neutral cytosine, whereas the HF, MPn, and CC methods indicate the C1a isomer. Table S1 (Supplementary Material) lists some selected DFT, HF, MPn, and CC data for the five major and minor tautomers of cytosine. Differences in the predicted ΔE values are not larger than 2 kcal mol−1. On the other hand, Brown et al. [31], applying MW spectroscopy to jet-cooled cytosine with a nozzle temperature of 568 K, estimated the populations of C2 and C1a to be in the approximate ratio 1:1 in the gas phase. Alonso et al. [36], who applied LA-MB-FT-MW spectroscopy, recently found a slight predominance of C1a and concluded that ab initio calculations did not reproduce the experimental observations well. When compared to thymine and uracil, cytosine is a very complex molecular system. A common conclusion on its major tautomers (C2 and C1a) in the gas phase has not yet been formulated. Microhydration by one, two, or three water molecules [42, 45, 46] and macrohydration using continuum models [40, 41, 45, 48, 49] favor the ‘canonical’ tautomer C2.

One-electron oxidation changes the relative energies of all individual isomers (Fig. 1), and consequently, it changes the composition of the tautomeric mixture (Scheme 2). The tautomeric mixture of oxidized cytosine consists of four major forms (C1a +·, C1b +·, C2 +·, and C3 +·, ΔG < 1 kcal mol−1), three minor forms (C6ab +·, C8a +·, and C8b +·, ΔG 2–4 kcal mol−1), and three rare forms (C5ba +·, C5bb +·, and C6aa +·, ΔG 6–10 kcal mol−1). The other isomers have ΔG values larger than 10 kcal mol−1. They may be neglected in the tautomeric mixture of positively ionized cytosine. The most dramatic changes in the tautomeric mixture of cytosine are caused by one-electron reduction. The CH isomers C9a −· and C9b −·, very rare forms for neutral and oxidized cytosine, become the favored structures for reduced cytosine. The Gibbs energy of the reduced canonical form C2 −· is larger by ca. 1 kcal mol−1 than that of C9b −·, and those of C1a −· and C1b −· are larger by more than 10 kcal mol−1 than that of C9b −·. The tautomeric mixture of reduced cytosine consists of three major forms (C2 −·, C9a −·, and C9b −·, ΔG < 2 kcal mol−1), two minor forms (C3 −· and C8b −·, ΔG 4–5 kcal mol−1), and three rare forms (C7aa −·, C7ba −·, and C8a −·, ΔG 7–9 kcal mol−1). The other isomers have ΔG values larger than 10 kcal mol−1. They may be neglected for negatively ionized cytosine.

The variations of the composition of the tautomeric mixture when proceeding from neutral to redox cytosine (Scheme 2) seem to originate from a combination of the analogous variations observed for the model compounds, 4APM (Scheme S2) and 2OHPM (Scheme S3), estimated at the same level of theory. The aromatic amine tautomer predominates in the gas phase for neutral and oxidized 4APM, whereas the non-aromatic CH imino isomer is favored for reduced 4APM [52]. For 2-hydroxypyrimidine, the oxo tautomers dominate in the tautomeric mixture for neutral, oxidized, and reduced forms [53]. Moreover, direct comparison of the DFT calculated thermodynamic parameters for the oxidized and neutral isomers of 4-aminopyrimidine, 2-hydroxypyrimidine, and cytosine indicates that one-electron oxidation is a very endothermic process and requires ca. 200 kcal mol−1. One-electron reduction is a more favorable process than one-electron oxidation and requires considerably lower energy. Both model compounds and cytosine may spontaneously take one electron from a reducing agent.

There are no experimental data in the literature for the ionization potential (IP) and for the electron affinity (EA) of the model compounds for comparison [74, 75]. However, it should be mentioned here that the literature IPs for 4-aminopyridine (8.8 eV [76]), 2-aminopyridine (8.5 eV [76]), and unsubstituted pyrimidine (9.3 eV [75]) are of the same order of magnitude as the DFT estimated adiabatic IPs for 4-aminopyrimidine (8.8 eV) and 2-hydroxypyrimidine (9.1 eV). On the other hand, the literature EA for unsubstituted pyrimidine is not very large: EA (electron transmission spectroscopy) –0.25 eV [77] and EA (G2MP2B3) –0.17 eV [74]. Taking the composition of the tautomeric mixture for neutral and ionized cytosine into account, one can estimate the IP (8.6 eV) and EA (0.1 eV) values in the gas phase for the favored ionization processes at the DFT level (Scheme 2). They are close to those for the model compounds (Schemes S2 and S3). The experimental IP value for cytosine (8.45 eV [78]) is not very different from the DFT one. Schiedt et al. [79] found two EA values for the cytosine dipole bound state, one for the amino-oxo tautomer (0.23 eV) and the other one for the amino-hydroxy tautomer (0.08 eV). For the valence state of the rare CH tautomer, Li et al. [51] estimated a larger EA value (2.34 eV).
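As a quick consistency check, the IP of 8.6 eV can be converted to kcal mol−1 and compared with the oxidation energy of ca. 200 kcal mol−1 mentioned earlier. The short Python sketch below does this conversion and also shows how adiabatic IP and EA values follow from total energies according to the definitions in the Methods section. The function names are hypothetical, the conversion factors are standard rounded values, and the total energies in the example are placeholders rather than computed energies of cytosine.

```python
EV_TO_KCAL = 23.0605      # kcal mol^-1 per eV (rounded)
HARTREE_TO_EV = 27.2114   # eV per hartree (rounded)


def ev_to_kcal(energy_ev):
    """Convert an energy from eV to kcal mol^-1."""
    return energy_ev * EV_TO_KCAL


def adiabatic_ip(e_cation_hartree, e_neutral_hartree):
    """IP = E(optimized radical cation) - E(optimized neutral), in eV."""
    return (e_cation_hartree - e_neutral_hartree) * HARTREE_TO_EV


def adiabatic_ea(e_neutral_hartree, e_anion_hartree):
    """EA = E(optimized neutral) - E(optimized radical anion), in eV."""
    return (e_neutral_hartree - e_anion_hartree) * HARTREE_TO_EV


# IP and EA values quoted in the text for cytosine at the DFT level.
print(f"IP 8.6 eV = {ev_to_kcal(8.6):.1f} kcal/mol")
print(f"EA 0.1 eV = {ev_to_kcal(0.1):.1f} kcal/mol")

# Placeholder total energies (hartree), for illustration only.
print(f"IP from placeholder energies: {adiabatic_ip(-1.0, -1.5):.2f} eV")
```

With these definitions, a positive EA means that the radical anion lies below the neutral form in energy, which is consistent with the statement that cytosine may spontaneously take one electron.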

Comparison of the relative Gibbs energies estimated at the DFT level for cytosine isomers with those estimated at the same level of theory for analogous isomers of 2OHPM and 4APM gives the possibility to estimate the total energetic effects (δG) of the exo NH2 and OH groups. These effects, δG(NH2) and δG(OH), include the classical inductive and resonance electronic substituent effects of the exo NH2 and OH groups and also the additional internal effects being a consequence of specific favorable and unfavorable interactions of the exo with endo neighboring groups. The δG(NH2) and δG(OH) values can be estimated for neutral and redox forms of cytosine for selected tautomeric conversions, which are analogous to those for model compounds.

The δG values of the exo NH2 group were found (Table 2) when cytosine was considered as the amino derivative of 2OHPM. Proceeding from 2OHPM to cytosine, the total energetic effects of the exo NH2 group (δG) on the iminol-amide and iminol-iminon conversions were estimated as differences between the relative Gibbs energies of analogous tautomeric conversions for cytosine and 2OHPM {δG(NH2) = ΔG(cytosine) − ΔG(2OHPM)}. For neutral forms, larger effects (4–5 kcal mol−1) occur for the C1 → C3 and C1 → C4 conversions than for the C1 → C2 one (δG ≤ 3 kcal mol−1). The reasons are as follows. For C4, the NH2 group can interact unfavorably with the C5H2 group. For C3, the NH2 group can interact unfavorably with the N3H and C5H groups, whereas for C2 an interaction between the NH2 group and the N3 atom can be favorable. For C1a and C1b, interactions between the OH group and the N1 or N3 atom are favorable. A similar tendency occurs for the reduced forms, for which one excess electron is taken by the ring for 2OHPM and cytosine. The δG(NH2) values are larger for the C1 → C3 and C1 → C4 conversions (δG > 2 kcal mol−1) than for the C1 → C2 one (δG 0.1 kcal mol−1). However, one-electron oxidation strongly increases the total effects of the exo NH2 group (δG ≥ 7 kcal mol−1). This group is favored for one electron loss in cytosine.

Table 2 Total energetic effects of the exo NH2 group (δG in kcal mol−1) on selected tautomeric conversions for cytosine analogous to those for 2OHPM

When cytosine was considered as the hydroxy derivative of 4APM, the total energetic effects of the exo OH group (δG) were calculated for the amine-imine and enamine-imine conversions. Table 3 summarizes the δG(OH) values for the conformations a and b of the OH group. They were estimated as differences between the relative Gibbs energies of analogous tautomeric conversions for cytosine and 4APM {δG(OH) = ΔG(cytosine) − ΔG(4APM)}. The largest effects (δG > 5 kcal mol−1) occur for tautomeric conversions of neutral isomers, for which the OH group interacts unfavorably with the N1H (C5aa and C5ab) or N3H group (C6ba and C6bb). When intramolecular interactions between neighboring groups are favorable, the total effects of the exo OH group are considerably lower, and the δG values do not exceed 4 kcal mol−1. A similar tendency and slightly larger δG values are found for oxidized cytosine. For reduced cytosine, variations of the δG values are not parallel to those for neutral and oxidized cytosine. However, the δG values are not larger than 8 kcal mol−1.

Table 3 Total energetic effects of the exo OH group (δG in kcal mol−1) on selected tautomeric conversions for cytosine analogous to those for 4APM

A relation between prototropy and electron delocalization was signaled more than 50 years ago by Pauling [26]. This relation has been recently discussed for some simple tautomeric systems [21]. Good linear relationships have been found for the neutral aromatic NH and non-aromatic CH tautomers of imidazole and purine, which have no substituent and which possess solely the endo functional groups [52, 53, 80–83]. Prototropy is also well related to electron delocalization for the neutral NH and CH tautomers of aminoazines and for the NH–NH and NH–CH tautomers of adenine [52, 53, 81]. Aminoazines and adenine contain not only the endo aza group(s) but also the exo –NH2/=NH group. For particular isomers, the exo group can interact intramolecularly with the endo functional groups. Since these internal effects are less important factors than aromaticity, they only slightly perturb the relation between prototropy and electron delocalization for neutral aminoazines and adenine.

Quite a different situation takes place for hydroxyazines and uracil, which contain the exo –OH/=O functional group(s) [53]. The relation between prototropy and electron delocalization seems to be more complex. The energetic parameters (ΔG), which measure prototropy, are not parallel to the geometric ones (HOMED), which measure electron delocalization. We observe a similar tendency for cytosine, which possesses one exo –OH/=O group (Fig. 3). Stability of functionalities seems to be a more important factor than aromaticity and seems to dictate the tautomeric preferences for neutral cytosine. Intramolecular interactions between the exo and endo groups also influence the conformational and configurational preferences for the hydroxy and imino tautomers. For ionized cytosine, the HOMED8/ΔG plots are more complex. There is no common HOMED8/ΔG relationship for radical cations or for radical anions.

Fig. 3

HOMED8/ΔG plots for neutral (red points), oxidized (blue points), and reduced (rose points) cytosine (Color figure online)

However, when cytosine is considered as the 4-amino derivative of 2-hydroxypyrimidine (2OHPM), the variations of the HOMED indices estimated for the whole tautomeric system of the neutral cytosine isomers, and those of their relative Gibbs energies (ΔG), closely parallel those for neutral 2OHPM (Fig. 4). The iminol–amide and iminol–iminon conversions for cytosine cause geometric changes similar to those for 2OHPM [53]. Some differences appear in the energetic parameters, particularly for the iminol–amide conversions, i.e., between the C2 and C3 isomers of cytosine and between 2OHPM2 and 2OHPM3.

Fig. 4

Plots between the HOMED indices and the ΔG values (in kcal mol−1) estimated for neutral 2-hydroxypyrimidine (2OHPM) isomers (gray points) and for analogous neutral isomers in cytosine (red points) considered as the 4-amino derivative of 2OHPM (Color figure online)

Interestingly, when cytosine is considered as the 2-hydroxy derivative of 4-aminopyrimidine (4APM), a plot of the HOMED indices estimated for the whole tautomeric system (eight bonds) against the relative Gibbs energies calculated for the neutral C1a, C1b, C5aa, C5ab, C5ba, C5bb, C6aa, C6ab, C6ba, C6bb, C7aa, C7ab, C7ba, and C7bb isomers is similar to that found previously [52] for the corresponding neutral 4APM isomers (Fig. 5). The amine-imine and enamine-imine conversions for neutral cytosine cause almost parallel changes of the geometric (HOMED) and energetic (ΔG) parameters. The somewhat stronger deviations of points for cytosine than for 4APM may result from intramolecular interactions (favorable or unfavorable) between the exo OH and endo N1/N1H or N3/N3H group in the two different conformations (a and b) of the OH group. These interactions influence the relative Gibbs energies more strongly than the HOMED indices. For example, differences between the ΔG values for the rotational isomers C5aa and C5ba, C5ab and C5bb, C6aa and C6ba, and C6ab and C6bb are close to 7–10 kcal mol−1, whereas those between the HOMED8 values are considerably smaller, close to 0.01–0.05. Rotation of the exo =NH group, and consequently the intramolecular interactions (favorable or unfavorable) between this group and the endo N3/N3H or C5Hi (i = 1 or 2) group, cause smaller energetic effects (≤5 kcal mol−1) than rotation of the OH group. Geometric effects for the exo =NH group are small (0.00–0.04 HOMED units) and similar to those for the exo OH group.

Fig. 5

Plots between the HOMED indices and the ΔG values (in kcal mol−1) estimated for neutral 4-aminopyrimidine (4APM) isomers (gray points) and for analogous neutral cytosine isomers (red points), considered as the 2-hydroxy derivative of 4APM (Color figure online)
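How closely the geometric and energetic parameters track each other in plots of this kind can be expressed with a least-squares slope and a Pearson correlation coefficient. The sketch below uses plain Python and invented (ΔG, HOMED) pairs; the numbers are placeholders rather than values read from the figures, and the helper name linear_fit is an assumption of this example.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return (slope, intercept, pearson_r) of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data: relative Gibbs energies (kcal/mol) and HOMED indices.
delta_g = [0.0, 5.0, 10.0, 15.0, 20.0]
homed_index = [0.95, 0.91, 0.84, 0.81, 0.74]

slope, intercept, r = linear_fit(delta_g, homed_index)
print(f"slope = {slope:.4f} per kcal/mol, intercept = {intercept:.3f}, r = {r:.3f}")
```

A coefficient close to −1 would correspond to the "almost parallel" behavior described for the amine-imine and enamine-imine conversions, whereas a value near zero would correspond to the absence of a common relationship.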

Conclusions

The DFT calculations performed for the complete tautomeric mixture of neutral and redox cytosine, consisting of all twenty-one possible isomers, show clearly that prototropy affects the geometric parameters more weakly than the energetic ones. On going from the neutral to the oxidized (positively ionized) or reduced (negatively ionized) form of cytosine, the composition of the tautomeric mixture changes significantly (Scheme 2). The rare tautomers of neutral cytosine become the favored ones for oxidized and reduced cytosine. These variations seem to originate from those observed earlier for the model compounds 4-amino- and 2-hydroxypyrimidine [52, 53].
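The link between relative Gibbs energies and the composition of a tautomeric mixture follows from the Boltzmann distribution, x_i = exp(−ΔG_i/RT) / Σ_j exp(−ΔG_j/RT). A minimal sketch, assuming 298.15 K and hypothetical ΔG values (the isomer labels A, B, and C, the numbers, and the function name populations are placeholders, not results from this work):

```python
from math import exp

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def populations(delta_g, temperature=298.15):
    """Map relative Gibbs energies (kcal/mol) to equilibrium mole fractions."""
    rt = R_KCAL * temperature
    lowest = min(delta_g.values())  # shift so the largest weight is exactly 1
    weights = {name: exp(-(g - lowest) / rt) for name, g in delta_g.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical relative Gibbs energies in kcal/mol (placeholders).
example = {"A": 0.0, "B": 1.0, "C": 7.0}
for name, fraction in populations(example).items():
    print(f"{name}: {fraction:.2e}")
```

With these placeholder inputs, an isomer lying 7 kcal mol−1 above the most stable one has a mole fraction below 10^-5 at room temperature, so a change of a few kcal mol−1 in ΔG on oxidation or reduction is enough to reorder the mixture.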

There is no good relation between the geometric (HOMED) and energetic (ΔG) parameters estimated for the individual isomers of cytosine (Fig. 3). This indicates that aromaticity is not the main factor that influences the tautomeric preferences. The high stability of the amide form of cytosine is similar to that of 2-pyrimidone [53]. It disrupts the relation between the geometric and energetic parameters for the amide-iminol conversions (Fig. 4). A good relation exists solely for the amine-imine and enamine-imine conversions (Fig. 5). It is similar to that found previously for 4-aminopyrimidine [52]. Deviations of some points result from intramolecular interactions between neighboring groups.

For cytosine incorporated into DNA, only seven isomers need to be considered (C2, C5aa, C5ab, C5ba, C5bb, C8a, and C8b). The DFT calculations clearly show that the C8 tautomer, proposed by Watson and Crick as the rare form [2], is present in the tautomeric mixture of cytosine at each oxidation state. The C8 tautomer may be responsible for point mutations in DNA. The four isomers of C5 may be neglected in the tautomeric mixture. Further studies of the 1-alkyl derivative of cytosine may give a quantitative estimate of the contributions of C2 and C8 in the DNA nucleotide when oxidizing or reducing agents are present in living organisms.