1 Introduction

The environmental fluctuations and reactive alterations experienced by nucleic acids (Fig. 1) are both wide ranging and complex including, but not limited to: base flipping [1], methylation [2,3,4,5,6], depurination [7, 8] and depyrimidination [9], and ligand binding [10]. These changes to the local environment occur both in concert with, and independent of, the expected changes experienced throughout the normal function of nucleic acids; namely, the \(\pi \)-stacking interactions observed upon accumulation into a strand of DNA [11,12,13,14,15,16], and the hydrogen bonding necessary for combining two strands into the double helix [17,18,19,20]. The insights gained from the ability to probe the behaviour of nucleic acids throughout the cell are both wide ranging and of incredible importance.

Fig. 1 Schematic structures of Watson–Crick base pairs between canonical bases

Due to its sensitivity and straightforward application, fluorescence spectroscopy has remained one of the most used tools for analysis, particularly in biomolecular systems [21,22,23,24,25,26,27,28,29]. Initially, application of fluorescence spectroscopy centred around the use of dyes, commonly involving cyanine or rhodamine moieties, attached to native nucleosides [30]. While these dyes were effective in visualising the location of bases within the cell, these compounds displayed poor sensitivity to local base–base interactions due to the use of large linkers separating the dye moiety from the base.

The advent of these methodologies, coupled with the lack of dye sensitivity to environmental changes, has created a niche role for environmentally sensitive fluorescent nucleobase analogues (FBAs) to act as powerful tools for the investigation of the structure, dynamics, and environmental properties of nucleic acids [31,32,33]. The proposed advantage of the analogues over their dye-based predecessors is that, lacking the mobility introduced by a long linker, they would have a well-defined geometry relative to the local DNA structure.

The probes created from these analogues (Fig. 2) can be considered to occupy one of a number of categories, ranging from: (i) those closely resembling the native bases they aim to replace to maintain the \(\pi \)-stacking and hydrogen bonding characteristics, such as 2-amino-purine (2AP) [30, 34]; (ii) molecules employing pteridine moieties [35], as seen with 6-MI [36] and 6MAP [37], which have seen use as highly fluorescent analogues for the purine bases (Fig. 1); (iii) bases which have been extended to include a conjugated, primarily aromatic, moiety such as that observed in 6AzaO and 6AzaS [33]; (iv) structures in which fluorescently active aromatic moieties are linked directly to the native base, as observed in the adenine analogues, pA [38], qA, qAN1, and qAnitro [39,40,41], the cytosine analogues tC \(^O\) [42], tC [30, 43] and \(^{DMA}\)C [44], as well as the thymine analogue \(^{diox}\)T [45].

Fig. 2 Schematic structures of the fluorescent nucleobase analogues selected for study

The sensitivity of base analogues to their environment has been shown to vary; while 2AP shows moderate environmental sensitivity [30, 46,47,48,49,50,51], tC and tC \(^O\) have been shown to be relatively insensitive to their microenvironment [30, 52,53,54]. Regardless of the degree of environmental sensitivity attributed to a given analogue, the ideal utilisation of these compounds is in the in vivo study of the cellular environment. This creates a number of challenges in the successful design of analogues: while many present high quantum yields, they commonly exhibit significantly lower absorption cross sections when compared to dye-based probes; additionally, the absorption of most analogues takes place at shorter wavelengths, often in the UV region, which can lead to undesirable photobleaching effects [55,56,57,58,59]; this high energy absorption presents a further challenge in that tissue penetration within biological media is particularly poor at these wavelengths [60, 61].

One method that has been suggested to address these challenges is through the use of multiphoton absorption [62]. Multiphoton absorption enables increased tissue penetration by allowing for a longer wavelength to be used while also reducing photobleaching of out of focus chromophores due to the increased three-dimensional localisation afforded. Additional advantages in the use of multiphoton absorption are: the ability to target excited states and, as such, areas of the excited state spectrum that are not accessible in one-photon absorption, as well as reducing background fluorescence from secondary chromophores and optical components [63].

A number of analogues have previously been investigated experimentally for two-photon viability, including: 2AP [30, 64], 6-MI [64, 65], 6MAP [66], ABN [67], pA [38, 68] and tC [30]. This paper looks to expand this investigation to a wider range of commonly utilised analogues (Fig. 2), assessing the two-photon profile of the isolated analogues as well as quantifying the environmental sensitivity of each analogue upon formation of: (i) hydrogen-bonded dimers such that Watson–Crick pairs (Fig. 1) are conserved; and (ii) \(\pi \)-stacked dimers with each of the native bases. Non-Watson–Crick base pairs and \(\pi \)-stacked dimers involving more than one analogue have not been accounted for here, nor have the effects of \(\pi \)-stacked trimers in which an analogue is sandwiched between two native bases; this limitation was imposed due to the scaling of the computational methods applied throughout the study. For ease of discussion, a shorthand nomenclature for the dimer structures is used throughout this paper, such that the hydrogen-bonded dimer between adenine and thymine is represented as A-T while the \(\pi \)-stacked dimer is represented by A-\(\pi \)-T. When discussing situations covering more than one of the canonical bases, the symbol B is used such that B-\(\pi \)-2AP represents the \(\pi \)-stacking interaction of the 2AP analogue and any canonical base.
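The shorthand can be illustrated with a small helper. This is only a sketch for the reader: the function names `dimer_label` and `expand_generic`, and the ordering of the bases in `CANONICAL_BASES`, are assumptions introduced here and are not part of the study.

```python
# Illustrative helper for the dimer shorthand (names are hypothetical).
CANONICAL_BASES = ("A", "C", "G", "T")  # ordering chosen here for illustration


def dimer_label(base: str, partner: str, stacked: bool = False) -> str:
    """Build 'A-T' for a hydrogen-bonded dimer or 'A-π-T' for a π-stacked one."""
    separator = "-π-" if stacked else "-"
    return f"{base}{separator}{partner}"


def expand_generic(label: str) -> list[str]:
    """Expand a label starting with the generic base symbol B into one label
    per canonical base; any other label is returned unchanged."""
    if not label.startswith("B-"):
        return [label]
    return [base + label[1:] for base in CANONICAL_BASES]


if __name__ == "__main__":
    print(dimer_label("A", "T"))                # A-T
    print(dimer_label("A", "T", stacked=True))  # A-π-T
    print(expand_generic("B-π-2AP"))
```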

2 Theoretical methods

Geometry optimisations and linear response, or time-dependent (TD), DFT calculations for the one-photon spectra were carried out with the Gaussian09 software package [69]; orbitals and surfaces were visualised using the GaussView5 interface [70].

Optimisations utilised the \(\omega \)B97X-D functional [71], which has been shown to provide strong results for \(\pi \)-stacking interactions [72,73,74,75], along with the cc-pVTZ basis set [76,77,78]; determination of structural minima was conducted through frequency analysis and noted by the presence of only positive curvature; the use of this model chemistry has been well established for giving appropriate geometry determinations for molecules of this type [79,80,81,82,83,84]. Geometry selection for hydrogen-bonded dimers was determined so as to mimic the hydrogen-bonded structures of correctly orientated Watson–Crick dimeric pairs.

Geometries corresponding to \(\pi \)-stacked minima were determined through optimisation from a starting point in which the planes of the bases were 3.4 Å apart and the twist of the \(\pi \)-stacked bases, determined by the methylated nitrogen of each base representing the contact point for the phosphate backbone, was set to best approximate the structure of a B-form DNA strand, assuming the formation of a Watson–Crick pair upon hydrogen bonding, as utilised and recommended in previous research [85, 86]; geometries were then allowed to relax through optimisation and, again, minima were determined through frequency analysis. Analysis of each structure at the determined minima showed minimal deviation from the 3.4 Å starting point (Table S1), with an average deviation of 0.05 Å. Solvent effects for all optimisations were accounted for through use of the PCM solvent model [87, 88] to approximate an aqueous environment.

TD-DFT data, for the characterisation of excited state character and determination of orbital contributions, were determined using a range of DFT functionals, namely: CAM-B3LYP [89], B3LYP [90,91,92,93,94,95], BLYP [90, 93, 95], BP86 [96, 97] and PBE0 [98], each with the Dunning cc-pVTZ basis [99]. BLYP values are reported here due to their agreement with the experimental UV–Vis spectra of a range of bases and analogues studied throughout this work [37,38,39,40,41,42, 44, 45, 53, 68, 100, 101] when compared to the other model chemistries tested; average errors for the functionals considered, when compared to the experimentally reported values, in the determination of two-photon energies were found to be: CAM-B3LYP = 0.62 eV; B3LYP = 0.29 eV; BLYP = 0.14 eV. Examples of functional performance are shown in Fig. 3: while the B3LYP functional does particularly well at determining the absorption of the pA analogue (error = 0.15 eV), this is offset by significantly worse performance for other analogues, such as 2AP (error = 0.32 eV). While the BP86 and BLYP functionals were found to perform equivalently, BLYP was selected due to its availability for QR-DFT calculations.

The one-photon UV–Vis spectra and the two-photon energies of absorption maxima were chosen as the criteria for benchmarking due to the reliability of density functional methods in reproducing these values at a quantitative level, particularly in relation to the excitation energies of the states to be investigated. This is in comparison to attempting to benchmark directly against the experimental two-photon cross section, where QR-DFT values are commonly found to be \(\times \)8–\(\times \)10 larger than the experimental values; this is partially due to the inability of the calculations, including only an implicit solvent model, to mimic the complex environment found in in vitro experimentation, which commonly results in a significant reduction in photoactivity compared to an isolated molecule. However, while the potential discrepancy in cross-section values should be noted, the model chemistries applied here are still capable of providing significant insight into the excited state behaviour of these compounds and their potential viability for TPA applications through identification of states with promising cross sections.

Direct comparison with experimental TPA cross-section values (Tables 2, 3, and S2) presents a consistent over-estimation of the cross-section value, as expected; this over-estimation ranges from \(\times \)2 to \(\times \)9, a larger range of agreement than generally found. Additionally, calculations appear to consistently, and perhaps unsurprisingly, identify a number of areas of predicted high TPA absorption at higher energies than commonly investigated during experiment.

Fig. 3 Calibration of the functional selected based on the determination of the one-photon spectra of 2AP (experimental maximum = 310 nm [30, 46,47,48,49,50,51]) and pA (experimental maximum = 390 nm [38, 68])

The use of quadratic response (QR) DFT calculations, as utilised here, has been well established in providing high accuracy values for similar systems [102,103,104,105,106,107,108]. Additionally, BLYP values were shown to present good agreement in the identification of high cross-section TPA states when compared to experimental values for 2AP [30, 64], 6MAP [66], 6-MI [64, 65], pA [38, 68], and tC [30].

Solvent effects during QR-DFT calculations were accounted for using the COSMO approximation to model an aqueous environment as implemented through FixSol in Dalton2020 [109,110,111,112]. An excited state threshold of 5.0 eV (\(\approx \) 250 nm) was applied in the reporting of data to limit the number of states in question to those that could reasonably be accessed in various environments, taking into account issues arising from the lack of tissue penetration for in vivo environments specifically [60, 61].
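The relationship between the 5.0 eV threshold and the quoted wavelength can be checked with a short conversion. The following is a minimal sketch, assuming degenerate two-photon absorption (each photon carries half of the state energy); the function names are introduced here for illustration only and do not correspond to any routine used in the study.

```python
# hc expressed in eV·nm, so that wavelength (nm) = HC_EV_NM / energy (eV).
HC_EV_NM = 1239.841984


def ev_to_nm(energy_ev: float) -> float:
    """Wavelength in nm of a single photon carrying energy_ev."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


def two_photon_wavelength_nm(state_energy_ev: float) -> float:
    """Wavelength in nm of each photon when two identical photons together
    supply state_energy_ev (assumption: degenerate two-photon absorption)."""
    return ev_to_nm(state_energy_ev / 2.0)


def below_threshold(energies_ev, threshold_ev=5.0):
    """Keep only the state energies at or below the reporting threshold."""
    return [e for e in energies_ev if e <= threshold_ev]


if __name__ == "__main__":
    print(round(ev_to_nm(5.0)))                  # 248, i.e. roughly 250 nm
    print(round(two_photon_wavelength_nm(5.0)))  # 496
```

Under this assumption, a state at the 5.0 eV threshold would be reached with photons of roughly 496 nm, which illustrates why two-photon excitation shifts the required light towards longer wavelengths.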

Within the Quadratic Response (QR) formalism, the TPA energies and \(\delta ^{TP}\) values can be solved for directly through the use of the single residue of the quadratic response function:

$$\begin{aligned} \lim \limits _{\omega _c \rightarrow \omega _f}(\omega _c - \omega _f)\left<\left<\hat{\mu _a};\hat{\mu _b}\hat{\mu _c}\right>\right>_{\omega _{f/2},\omega _i} = -T_{ab}^{2\omega ,f}\left\langle f\right| \hat{\mu _c}\left| 0\right\rangle . \end{aligned}$$
(1)

Utilising an implicit summation over an infinite number of states, the two-photon probabilities (\(\delta \)(\(\omega \))) can be obtained in the form of single residues such that:

$$\begin{aligned} \lim \limits _{\omega _c \rightarrow \omega _{f}} (\omega _c - \omega _{f})\left<\left<\mu _a;\mu _ b\mu _c\right>\right>_{-\omega _b\omega _c} = \delta (\omega )\left\langle f\right| \mu _c\left| 0\right\rangle \end{aligned}$$
(2)

in which \(\delta (\omega )\) can be expressed as:

$$\begin{aligned}&-\sum _{X}\left[ \frac{\left\langle 0\right| \mu _a\left| X\right\rangle \left\langle X\right| \mu _b - \left\langle 0\right| \mu _b\left| 0\right\rangle \left| Y\right\rangle }{(\omega _{0X}-(\omega _{0Y}-\omega _{0b}))-i\Gamma _{X0}}\right. \nonumber \\&\left. + \frac{\left\langle 0\right| \mu _b\left| X\right\rangle \left\langle X\right| \mu _a - \left\langle 0\right| \mu _a\left| 0\right\rangle \left| Y\right\rangle }{(\omega _{0X}-\omega _{0b})-i\Gamma _{X0}}\right] , \end{aligned}$$
(3)

where \(\mu _a\), \(\mu _b\), and \(\mu _c\) represent the dipole moments produced in response to an electric field of frequency \(\omega _a\), \(\omega _b\), and \(\omega _c\), respectively. It is these dipole moments that can then be utilised to determine \(\delta ^{TP}\) values.

Two-photon cross sections (\(\sigma ^{TP}\)) are reported here, as defined through QR-DFT implemented in Dalton2020 [109, 110]. Reported values are determined by the equation:

$$\begin{aligned} \sigma ^{TP} = \frac{8\pi ^3\alpha ^2\hbar }{e^4}E^2 \delta ^{TP}, \end{aligned}$$
(4)

producing values in a.u, defined as 10\(^{-50}\) cm\(^4\) s photon\(^{-1}\), equating to Goeppert–Mayer units (GM), where:

$$\begin{aligned} E=\frac{\omega }{\Gamma }, \end{aligned}$$
(5)

such that \(\omega \) is the photon energy in eV and \(\Gamma \) is a broadening factor of 0.1 eV [113]; while uncertainty in the broadening factor can create an argument for the use of values larger, or smaller, by a factor of 2, the value chosen here is consistent with previous work [35, 114, 115] and with the prediction of the two-photon line shape of a number of other molecules [113, 116,117,118,119]. \(\alpha \) is the fine-structure constant, and the transition strength (\(\delta ^{TP}\)) is given by [102]:

$$\begin{aligned} \delta ^{TP} = F\delta ^F + G\delta ^G + H\delta ^H, \end{aligned}$$
(6)

in which F, G, and H vary depending on the polarisation of light used; under parallel linearly polarised light, \(F=G=H=2\). In addition, each component (\(\delta ^F/\delta ^G/\delta ^H\)) takes the form:

$$\begin{aligned} \delta ^F = \frac{1}{30} \sum _{\alpha ,\beta } S_{\alpha \alpha }S_{\beta \beta }, \end{aligned}$$
(7)
$$\begin{aligned} \delta ^G = \frac{1}{30} \sum _{\alpha ,\beta } S_{\alpha \beta }S_{\alpha \beta }, \end{aligned}$$
(8)
$$\begin{aligned} \delta ^H = \frac{1}{30} \sum _{\alpha ,\beta } S_{\alpha \beta }S_{\beta \alpha }, \end{aligned}$$
(9)

where \(\alpha \) and \(\beta \) are the components of the dipole operator such that \(\alpha ,\beta = x,y,z\), and the sums contained within each term run over combinations of these components for the operators acting between the ground (\(\left| 0\right\rangle \)), intermediate (\(\left| i\right\rangle \)), and final (\(\left| f\right\rangle \)) states, such that:

$$\begin{aligned} S_{\alpha \beta } = \sum _{i} \left[ \frac{\left\langle 0\right| \mu ^{\alpha }\left| i\right\rangle \left\langle i\right| \mu ^{\beta }\left| f\right\rangle }{\omega _i - \frac{\omega _f}{2}} + \frac{\left\langle 0\right| \mu ^{\beta }\left| i\right\rangle \left\langle i\right| \mu ^{\alpha }\left| f\right\rangle }{\omega _i - \frac{\omega _f}{2}}\right] , \end{aligned}$$
(10)

where \(\omega _i\) is the transition frequency to the intermediate (virtual) state, and \(\omega _f\) is the transition frequency for the final state, i.e., the state in question.
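To make the structure of Eqs. 6–10 concrete, the sketch below builds the transition tensor from a truncated sum over intermediate states and then forms the orientationally averaged transition strength. It is an illustration only, under stated assumptions: transition dipoles are taken to be real, the sum runs over a finite list of states rather than the implicit infinite summation of the response formalism, and the function names and input layout are hypothetical.

```python
AXES = range(3)  # Cartesian components x, y, z


def s_tensor(mu_0i, mu_if, omega_i, omega_f):
    """Two-photon transition tensor S[a][b] following Eq. 10.

    mu_0i[k][a]: component a of <0|mu|i_k> (assumed real)
    mu_if[k][a]: component a of <i_k|mu|f> (assumed real)
    omega_i[k]:  excitation energy of intermediate state i_k
    omega_f:     excitation energy of the final state
    """
    S = [[0.0] * 3 for _ in AXES]
    for d0, df, w in zip(mu_0i, mu_if, omega_i):
        denom = w - omega_f / 2.0  # diverges as w approaches omega_f / 2
        for a in AXES:
            for b in AXES:
                S[a][b] += (d0[a] * df[b] + d0[b] * df[a]) / denom
    return S


def delta_tp(S, F=2.0, G=2.0, H=2.0):
    """Orientationally averaged transition strength, Eqs. 6-9.
    The defaults F = G = H = 2 correspond to parallel linearly polarised light."""
    d_f = sum(S[a][a] * S[b][b] for a in AXES for b in AXES) / 30.0
    d_g = sum(S[a][b] * S[a][b] for a in AXES for b in AXES) / 30.0
    d_h = sum(S[a][b] * S[b][a] for a in AXES for b in AXES) / 30.0
    return F * d_f + G * d_g + H * d_h


if __name__ == "__main__":
    # Toy input with arbitrary values: one intermediate state, dipoles along x.
    S = s_tensor([(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [2.0], 3.0)
    print(S[0][0], delta_tp(S))
```

In the toy input the only non-zero element is the xx component, so the three averaged contributions are equal; real systems mix diagonal and off-diagonal elements and the three terms then differ.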

Due to the nature of the denominator in Eq. 10 (\(\omega _i - \frac{\omega _f}{2}\)) there emerges the possibility, when the calculation of large numbers of states is required, that artificially enhanced values may be observed when \(\omega _i - \frac{\omega _f}{2} \rightarrow 0\). This effect is most commonly observed when a compound possesses a low lying state (e.g., \(S_1\)) at close to half the excitation energy of a higher lying state; this can be seen when considering the T-qAnitro dimer (Table S6), in which the \(S_1\) and \(S_2\) states (1.94 and 2.27 eV) present high resonance with states \(S_{16}\) and \(S_{28}\) (3.94 and 4.53 eV, respectively). While the presence of this artificial enhancement does result in a reduction of the quantitative accuracy/reliability in the description of these specific states, the qualitative inference that they represent the location of an experimentally accessible state is still reliable.
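The near-resonance condition can be screened for directly from a list of state energies. The sketch below uses the four T-qAnitro energies quoted above; the tolerance of 0.05 eV and the function name are assumptions made here for illustration and are not criteria taken from the study.

```python
def near_resonances(energies_ev, tol_ev=0.05):
    """Return (intermediate, final, detuning) for every pair of states where
    the lower state lies within tol_ev of half the energy of the higher state,
    i.e. where the denominator of Eq. 10 becomes small."""
    hits = []
    for label_i, e_i in energies_ev.items():
        for label_f, e_f in energies_ev.items():
            if e_f <= e_i:
                continue  # only consider final states above the intermediate
            detuning = e_i - e_f / 2.0
            if abs(detuning) <= tol_ev:
                hits.append((label_i, label_f, round(detuning, 3)))
    return hits


if __name__ == "__main__":
    # Energies (eV) quoted in the text for the T-qAnitro dimer.
    states = {"S1": 1.94, "S2": 2.27, "S16": 3.94, "S28": 4.53}
    for intermediate, final, detuning in near_resonances(states):
        print(intermediate, final, detuning)
```

With these inputs the screen flags the S1/S16 and S2/S28 pairs, whose detunings are a few hundredths of an eV or less, in line with the resonances described for this dimer.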

3 Results and discussion

Table 1 QR-BLYP//cc-pVTZ state energies (eV [nm]) and two-photon cross sections (GM) of canonical bases and their Watson–Crick base paired dimers
Table 2 QR-BLYP//cc-pVTZ state energies (eV [nm]) and two-photon cross sections (GM) of quadracyclic adenine (qA) family of purine analogues
Table 3 QR-BLYP//cc-pVTZ state energies (eV [nm]) and two-photon cross sections (GM) of purine analogues
Table 4 QR-BLYP//cc-pVTZ state energies (eV [nm]) and two-photon cross sections (GM) of pyrimidine analogues

3.1 Bases vs analogues

Calculations on each of the canonical bases (Table 1) show only a few states present below the 5 eV threshold, with each of these states presenting a negligible TPA cross-section value. It is, therefore, relatively clear that application of TPA methodologies to the canonical bases directly is not a viable strategy. However, the lack of interference from the canonical bases with regard to a competing TPA spectrum can be considered an added benefit to the use of analogue probes within a larger DNA structure.

In contrast, a key feature of the analogues (Tables 2, 3, and S2) is the presence of accessible states at significantly lower energies compared to their canonical counterparts.

Amongst the purine analogues, a distinction can be made between the behaviour of the analogues that closely resemble the structures of the canonical purine bases (Table 3), and those such as qA (Table 2), which show a more significant structural deviation from the canonical base they emulate.

Analogues showing high structural similarity with their base counterpart (Table 3) show a similar, yet red-shifted, spread of accessible states when compared to adenine and guanine. While significantly more states are present below the 5 eV threshold, very few of these states present a reasonable cross section when compared to both the canonical bases and the other analogues, with the highest cross section being assigned to \(\hbox {S}_7\) of 6-MI. However, this state lies at 4.56 eV (\(\approx \) 272 nm), too high for reasonable use within a biological environment.

The spectra of those adenine analogues showing more significant structural deviations (Table 2) present a more complex range of states, despite being a relatively similar family of molecules. Of particular interest are pA and qAnitro, both of which exhibit an \(S_1\) state with high cross-sectional values of 48.2 and 42.6 GM at 2.90 and 1.96 eV, respectively, compared to an experimental value for pA of 6.6 GM; these \(\pi \)\(\pi ^*\) states both show charge transfer character from the purine moiety to the portion of the analogue furthest from the hydrogen bonding centres, a character that is particularly pronounced in qAnitro; both analogues also show additional states possessing high cross sections between 3.50 and 4.00 eV. It is also worth noting that qAnitro presents an additional set of states with exceptionally high cross sections at \(\approx \) 4.25 eV; these states are, however, significantly higher in energy than would be ideal for use in a biochemical environment. In comparison, qA and qAN1 possess more muted spectra, with the lowest lying state for each analogue lying at \(\approx \) 3.20 eV. This trend in cross-sectional values, coupled with analysis of the orbitals involved in high cross-section transitions (Fig. 4), suggests that creating an electronic alteration to the upper ring of the qA molecule to promote charge transfer character in the excited state can have a significant effect on the cross sections of accessible states.
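As a quick arithmetic check on the pA comparison, the ratio of the computed to the experimental cross section can be evaluated from the two values quoted above. This is a minimal sketch; the function name is introduced here for illustration only.

```python
def overestimation_factor(computed_gm: float, experimental_gm: float) -> float:
    """Ratio of a computed two-photon cross section to its experimental value."""
    if experimental_gm <= 0:
        raise ValueError("experimental cross section must be positive")
    return computed_gm / experimental_gm


if __name__ == "__main__":
    # pA S1 state: 48.2 GM (QR-DFT) against 6.6 GM (experiment), as quoted above.
    factor = overestimation_factor(48.2, 6.6)
    print(f"{factor:.1f}")  # 7.3
```

A factor of about 7.3 sits inside the \(\times \)2 to \(\times \)9 range of over-estimation reported for the direct comparison with experiment.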

Fig. 4 Lowest lying states presenting high cross sections, as determined by QR-BLYP//cc-pVTZ, within the TPA manifold, of each modified member of the quadracyclic adenine (qA) family of analogues. qAN1, top; qAnitro, middle; pA, bottom

Analogues for the pyrimidine bases (Fig. 2), involving significant structural variations when compared to their canonical base counterpart (Fig. 1), show the red-shifted spectra associated with all analogues, along with significantly increased cross-sectional values across the majority of the spectra (Table S2). However, due to the low density of states found in these analogues, the majority of states showing cross sections of greater than 10 GM lie at too high an energy to be ideal for the application of TPA methods to an in vivo environment. Exceptions to this can be seen in the 1,3-diaza-2-oxophenoxazine analogue (tC \(^O\)) and its sulphur-containing equivalent, tC. Both of these compounds present \(S_2\) and \(S_3\) states that lie at relatively low energy and possess high cross-section values, with the \(S_3\) state of tC \(^O\), at 3.91 eV, showing the largest cross section of 24.8 GM. In the case of both tC \(^O\) and tC, the \(S_3\) state presents \(\pi \)\(\pi ^*\) character resulting in charge transfer from the pyrimidine moiety into the extended structural manifold; in tC \(^O\) the charge is dispersed across the upper rings of the modified structure; in comparison, charge is isolated more readily on the additional group VI heteroatom (sulphur) of tC. In contrast, the \(S_2\) state of tC \(^O\), at 3.79 eV and 19.1 GM, shows \(\pi \)\(\pi ^*\) character in which charge is isolated to the pyrimidine moiety.

The values presented for these analogues in an isolated state highlight the significant potential for applicability in TPA-based spectroscopic analysis, particularly pertaining to identification of transport and localisation pathways within a cell. However, little can be inferred from these data as to the robustness of the spectra of each analogue to electronic interference, commonly brought about through interactions with surrounding bases.

3.2 Effects of H-bonding within Watson–Crick base pairs

The formation of Watson–Crick base pairs between the canonical bases (Fig. 1) produces a TPA spectrum which can be primarily described as an overlap of the monomeric spectra. The most notable change in the dimers (Table 1), when compared to the isolated bases, is a significant reduction in the \(\hbox {S}_1\) energies. However, no significant change is observed in the cross-section values for the states considered.

Given the relatively weak TPA spectra of the canonical bases (Table 1) when compared to their analogues (Tables 2, 3, and S2), combined with the nature of the intramolecular hydrogen-bond network, dimers formed between a canonical base and an analogue of its corresponding Watson–Crick pair can be described as falling into one of three categories. These categories, described by the character of TPA-accessible \(\pi \)\(\pi ^*\) states are: (i) excitations to states with high TPA cross sections are isolated solely to the analogue structure with little or no charge crossing the hydrogen-bond network; (ii) excitations show movement of charge across the hydrogen-bond network, whether from base to analogue or vice versa; (iii) formation of the dimeric structure results in the activation of the canonical base such that excitations to some states present minimal electronic character on the analogue.

3.2.1 Base pairs with adenine

Upon dimer formation between adenine and \(^{diox}\)T (A-\(^{diox}\)T), the transitions isolated to the analogue, which correspond to the monomeric high cross-section states (Table 4), undergo a red shift of \(\approx \)0.2 eV (Table S2); additionally, three new high cross-section states are introduced. Of particular interest, however, is \(S_{13}\), the lowest lying high cross-section state of the dimer, which presents \(\pi \)\(\pi ^*\) character isolated solely on the adenine residue; this relatively low lying photoactivation of the canonical base, observed in a number of dimeric structures, presents an intriguing avenue of investigation in the development of targeted photodynamic methodologies.

In comparison, A-6AzaO and A-6AzaS show relatively unaltered spectra in terms of the spread of high cross-section states though with a notable reduction of the cross-section values (Table S2).

3.2.2 Base pairs with guanine

Analogues for the cytosine base, when dimerised with guanine, present blue-shifted spectra when compared to the adenine dimers formed with thymine analogues; these blue-shifted spectra were also observed in the monomeric structures.

While \(S_{14}\) of the G-\(^{DMA}\)C dimer shows a slight increase in cross-section value compared to the monomer (Table 4), the most striking feature of the G-B dimers is the near complete reduction of activity in the sub-5 eV region of the G-tC dimer. This reduction can be primarily attributed to the hydrogen bonds formed with the guanine base causing a significant blue-shift in the spectrum of the tC analogue, while the spectrum of the guanine itself is relatively unaffected.

3.2.3 Base pairs with thymine

In a similar manner to the spectra of the Watson–Crick hydrogen bonded pair, the spectra involving analogues to the adenine base, when hydrogen bonded to the thymine base (Table S2), can be considered as a simple overlap of the monomeric spectra with minimal effects of hydrogen bond formation on either structure. While negligible shift is observed in the dimeric spectra, some notable reduction is observed. This reduction can be seen in states \(S_{12}\) and \(S_{14}\) of qA, though the cross section of \(S_{3}\) remains relatively unchanged; and states \(S_{14}\) and \(S_{25}\) of qAN1. However, notably, the cross-section values of T-qAnitro and T-pA remain relatively unchanged when compared to their monomeric spectra (Table 2).

3.2.4 Base pairs with cytosine

The C-6-MI (Table 2) dimer also presents a spectrum described predominantly as an overlap of the cytosine and 6-MI spectra (Tables 1 and 3, respectively). However, a new low-energy, high cross-section state is observed in the \(S_2\) position (3.46 eV), which presents a reasonable cross section (15.8 GM) when compared to the more prominent analogues (e.g. qAnitro and pA).

3.3 Effects of nucleobase \(\pi \)-stacking

Compared to the changes in the spectra of each analogue upon the formation of hydrogen-bonded dimers (Tables S2 and S3), the effects of \(\pi \)-stacking (Tables S6–S13) are much more pronounced. In a manner similar to that of the hydrogen-bonded dimers, interactions between the \(\pi \)-stacked monomers do open a number of charge transfer states not present in either monomer; however, the key observation is that \(\pi \)-stacking interactions appear to result in a severe reduction in cross-section value in a large number of the high cross-section states discussed so far. It is worth noting that, while the reduction in cross section does bring the QR-DFT values more in line with those observed experimentally, even the largest reductions, observed when \(\pi \)-stacking with the thymine base (Table S12), do not on their own provide an explanation for the discrepancy noted between the experimental and theoretical results.

While this observation is not especially surprising given the predominantly \(\pi \)\(\pi ^*\) nature of the excited states of the analogues, these data strongly reaffirm that the effects of \(\pi \)-stacking on new analogue candidates can, and should, act as a major design consideration when quantifying their feasibility, even before the candidates are tested in a harsher, in vivo environment.

These trends, consisting primarily of a reduction of lower lying states with the rare observation of new high cross-section states, are reliably observed when considering the analogues as they \(\pi \)-stack with each canonical base.

3.3.1 Purine analogue \(\pi \)-stacking

In a similar manner to the formation of hydrogen-bonded dimers (Tables S2 and S3), a distinction can be readily drawn between those purine analogues possessing an extended structure (qA, qAN1, qAnitro, and pA), and those more closely resembling their canonical counterparts (2AP, 6MAP, and 6-MI). Of the more structurally similar analogues, 2AP is rendered relatively inaccessible in the sub-5 eV region and 6MAP presents only a pair of high-energy accessible states at \(S_{24}\) and \(S_{25}\), each lying just under 5 eV. The B-\(\pi \)-6-MI dimers are also observed to undergo significant reduction of their cross-section values upon exposure to a \(\pi \)-stacking environment (Table 3).

As with both the isolated monomers (Table 2) and the hydrogen-bonded dimers (Table S3), the spectra of the \(\pi \)-stacked dimers of the qA family of analogues appear to be significantly more accessible than those of the smaller purine analogues, with particularly promising high cross-section states in the lower-energy region of each spectrum. The starting moiety for this family, qA, shows a notable degree of variation in the spectra depending on the base with which it is interacting; \(\pi \)-stacking with a purine base results in the relative reduction of the low energy states (\(\approx \) 3.4 eV) as well as the higher energy states (\(\approx \) 4.7 eV), while only a small reduction in cross-section value is observed when stacked with a pyrimidine base.

The alteration of the outer phenyl moiety of qA to the pyridyl moiety found in qAN1 does not appear to alter the trend observed upon \(\pi \)-stacking. However, a notable observation in the B-\(\pi \)-qAN1 spectra is the significant reduction of the monomeric \(S_6\) state (Table 2) upon formation of a \(\pi \)-stacking interaction, regardless of the base.

Fig. 5 Lowest lying states presenting high cross sections, as determined by QR-BLYP//cc-pVTZ, within the TPA manifold, of each modified member of the quadracyclic adenine (qA) family of analogues upon \(\pi \)-stacking with the guanine base. qAN1, top; qAnitro, middle; pA, bottom

Despite the reduction effects of the \(\pi \)-stacking interactions, qAnitro remains the most accessible of the analogues studied here. Of particular interest is the monomeric \(S_1\) state (Fig. 4) which, as well as occurring at a low energy (\(\approx \) 1.95 eV), maintains a high cross-section value in each of the B-\(\pi \)-qAnitro dimers, with only the G-\(\pi \)-qAnitro dimer showing any significant reduction in cross-section value. A notable characteristic of these transitions is that, even in the presence of the \(\pi \)-stacked environment, there is minimal change to the overall excited state character (Fig. 5). In comparison, the \(S_4\) state of the qAnitro monomer (4.58 eV; 14.4 GM) is quenched in each of the B-\(\pi \)-qAnitro dimers, with the exception of T-\(\pi \)-qAnitro, where it is slightly destabilised (\(\Delta E\) = 0.04 eV) and the cross section (15.5 GM) remains in the same range as that of the monomer (Table 2). Additionally, a number of dimer specific high cross-section states are observed in the 3.9–4.6 eV region (Tables S6–S9), determined by base specific shifting of the monomeric \(S_9\), \(S_{10}\), and \(S_{14}\) states (Table 2).

Regarding the overall chemistry of the qAnitro analogue: while the excited-state chemistry in the hydrogen-bonded dimers appears to be dominated by the drawing of charge towards the \(NO_2\) group of qAnitro, upon \(\pi \)-stacking that chemistry appears to be inverted, with the majority of high cross-section states implying the movement of charge away from the \(NO_2\) moiety, either onto the interacting base or throughout the dimeric structure.

Contrary to most of the other purine analogues, the B-\(\pi \)-pA dimers (Tables S6–S9) show minimal change to their spectra when compared to that of the pA monomer (Table 2), with the most notable effect of \(\pi \)-stacking being a mild reduction in cross section observed across each B-\(\pi \)-pA spectrum. However, due to the large cross sections observed in the pA spectrum, most high cross-section states remain viable targets even when quenched by the \(\pi \)-stacking interactions. The \(\pi \)-stacking interactions formed with the pyrimidine bases (Tables S8 and S9) appear to have more of an effect than their purine counterparts, with the T-\(\pi \)-pA dimer presenting the largest reduction in the cross-section value of the high-interest \(S_1\) state.

3.3.2 Pyrimidine analogues \(\pi \)-stacking

In comparison to the purine analogues, the \(\pi \)-stacked spectra of the pyrimidine analogues (Tables S10–S13) show significantly less variation relative to their monomeric counterparts (Table S2), both in the energetic shifting of states and in the introduction of additional states under the 5 eV threshold. This reduced variation is observed across the studied analogues in spite of the high degree of structural variation between them (Fig. 2).

Across each of the spectra studied through this work, only the high-energy states of the B-\(\pi \)-6AzaO and B-\(\pi \)-6AzaS structures (\(\approx \) 4.8 eV), as well as the lower-energy states of the B-\(\pi \)-tC\(^O\) and B-\(\pi \)-tC dimers (\(\approx \) 3.8 eV), remain relatively accessible.

4 Conclusion and outlook

The applicability of expanding the use of modern nucleotide base analogues to include two-photon spectroscopic methodologies is evident from the data presented throughout this work; however, these data also highlight that, while a number of the analogues assessed would warrant use in an in vitro or ex vivo setting, the vast majority of the states possessing a sufficiently large cross-section value lie at energies too high, and hence excitation wavelengths too short, to allow for the tissue penetration desirable in an in vivo or in cellulo setting. Of the analogues studied here, only qAnitro and pA, each with an \(S_1\) state significantly under the 3 eV mark and cross-section values consistently over 20.0 GM across both hydrogen-bonded and \(\pi \)-stacked dimeric structures (Tables S3, S6–S9), stand out as potential candidates for use with TPA methodologies. However, given that qAnitro is not reported to fluoresce, it cannot be recommended for use, as this property is independent of whether OPA or TPA methods are used. Instead, we can look to the structure of qAnitro, coupled with the newly presented ABN analogue [67], for insight into the design of analogues with high cross sections. As can be seen in the orbital transitions of qAnitro (Figs. 4 and 5), and the work done by Samaan et al. [67], there is significant merit in pursuing the construction of push–pull motifs, commonly considered a hallmark of optically bright organic fluorophores, as a mechanism for enhancing the TPA cross section of future FBAs.

In comparison, the next best candidates (tC and tC\(^O\)) present high cross-section \(\pi \)\(\pi ^*\) states in the sub-4 eV region, but these are quenched upon \(\pi \)-stacking with any base (Tables S10–S13), so that the lowest-lying high cross-section state across all environments is calculated in the 4–5 eV range; however, the environmental dependence of the low-lying state of tC may find use in niche situations in following the incorporation pathways of nucleotide bases throughout the cell.
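The energy thresholds quoted above can be related to excitation wavelengths with a short conversion. The following is a minimal sketch, not part of the reported calculations: it assumes degenerate two-photon absorption (two photons of equal energy, each carrying half the transition energy), and the tissue window of 650–1350 nm is an assumed, commonly quoted near-infrared range rather than a value taken from this work. All function and constant names are hypothetical.

```python
# Illustrative conversion only; not part of the reported quadratic-response calculations.

HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV nm

# ASSUMPTION: a commonly quoted near-infrared tissue window, not a value from this work.
ASSUMED_WINDOW_NM = (650.0, 1350.0)


def one_photon_wavelength_nm(energy_ev: float) -> float:
    """Wavelength of a single photon carrying the full transition energy."""
    if energy_ev <= 0:
        raise ValueError("transition energy must be positive")
    return HC_EV_NM / energy_ev


def two_photon_wavelength_nm(energy_ev: float) -> float:
    """Wavelength of each photon in degenerate TPA (each carries half the energy)."""
    return 2.0 * one_photon_wavelength_nm(energy_ev)


def in_window(wavelength_nm: float, window=ASSUMED_WINDOW_NM) -> bool:
    """True if the wavelength lies inside the assumed tissue window."""
    low, high = window
    return low <= wavelength_nm <= high


if __name__ == "__main__":
    # Energies taken from the text: qAnitro S1, the 3 eV mark,
    # the 4-5 eV range, and the qAnitro monomer S4 state.
    for label, energy in [
        ("qAnitro S1", 1.95),
        ("3 eV mark", 3.0),
        ("4 eV", 4.0),
        ("qAnitro S4", 4.58),
        ("5 eV", 5.0),
    ]:
        tpa = two_photon_wavelength_nm(energy)
        print(f"{label:>10}: {energy:.2f} eV -> TPA at {tpa:7.1f} nm, "
              f"inside assumed window: {in_window(tpa)}")
```

Under these assumptions, a 1.95 eV state corresponds to two-photon excitation near 1272 nm and a 3 eV state to roughly 827 nm, whereas states in the 4–5 eV range require excitation between roughly 496 and 620 nm, which falls short of the assumed window.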

Throughout this study, a number of design and testing principles have begun to emerge as worth addressing with regard to the development of new two-photon-based analogues for the canonical bases. Primary amongst these is the need for the promotion of lower-lying high cross-section states to enhance applicability in an in vivo environment, particularly given the observation that only qAnitro and pA present high cross-section states in the sub-3 eV region, both of which lie just within the ideal range for tissue penetration [120,121,122]. The second, stand-out, design consideration is that the \(\pi \)-stacking environment represents a substantial factor in the accessibility of low-lying states. The data presented here would also suggest that the presence of an extended, conjugated, polycyclic framework within the analogue structure has a significant effect on the two-photon cross sections of an analogue. In addition, the effects of changing from the phenyl moiety of qA to a more electron-withdrawing pyridyl moiety, as seen in qAN1, or of including a strong electron-withdrawing group, such as the \(NO_2\) present on qAnitro, show that it is possible to promote the preservation of the monomeric spectra in a dimeric environment by preventing the distribution of charge through the incorporation of small electronic substituents, presenting a promising design feature for the development of novel analogues with a higher fluorescent capacity.

One of the primary effects of including these modifications is to move charge away from both the hydrogen-bonding and \(\pi \)-stacking environments (Fig. 4), acting to protect the excited-state character from the changing chemical environment (Fig. 5) and preserving the high cross-section values that define the modified members of the qA family of analogues. In light of this, the investigation of the effects of dedicated electron-donating and electron-withdrawing groups would be warranted to assess the potential of local base activation, with the aim of moving from biochemical probes to gene-targeted photosensitising compounds.

Given the design principles discussed here, it would follow that the development of two-photon analogues relating to A and C would offer significantly more design flexibility than those relating to the structures of G or T; this is primarily because any addition to the structure is more limited when there is a need to preserve the carbonyl group involved in the hydrogen bonding of G or T. In comparison, the amino group found on A and C can still effectively take part in the hydrogen-bonding network whether as a primary amine (\(^{DMA}\)C, 2AP and 6MAP) or as a secondary amine (tC\(^O\), tC, and the qA family of analogues). There is potential for investigation into the effects of substituting the hydrogen-bonding carbonyl group of G and T for an imine group (–N=) to preserve an accessible lone pair while allowing for extension of the polycyclic framework, but it is uncertain how this would affect the hydrogen-bond framework of these compounds.

In the design of future analogues, it is worth noting that a common property of the majority of the analogues studied is the presence of a minimal to negligible permanent dipole moment in comparison to highly TPA-active organic chromophores. This results in a reliance solely on the effects of the transition dipole between the ground and excited state, a single factor in the determination, and scaling, of the TPA viability of a molecule [123,124,125]. The investigation of novel analogues specifically designed to incorporate a permanent dipole may open up a new avenue in the development of promising TPA candidates; this avenue, however, will come with challenges in maintaining the permanent dipole throughout the differing hydrogen-bonding and \(\pi \)-stacking environments experienced by the analogues.
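The interplay between transition and permanent dipoles can be made explicit with the standard sum-over-states expression for the two-photon transition tensor and its two-state truncation. This is a textbook sketch rather than the quadratic-response treatment used in this work; the symbols are introduced here for illustration only, and atomic units with two photons of equal energy are assumed. With \(\omega _n\) the excitation energy of state \(n\) from the ground state \(0\), and \(f\) the final state,

\[
S_{\alpha \beta } = \sum _n \left[ \frac{\langle 0|\mu _\alpha |n\rangle \langle n|\mu _\beta |f\rangle }{\omega _n - \omega _f/2} + \frac{\langle 0|\mu _\beta |n\rangle \langle n|\mu _\alpha |f\rangle }{\omega _n - \omega _f/2} \right] .
\]

Retaining only the terms \(n = 0\) and \(n = f\) gives

\[
S_{\alpha \beta }^{\text {two-state}} = \frac{2}{\omega _f}\left( \mu ^{0f}_\alpha \, \Delta \mu _\beta + \mu ^{0f}_\beta \, \Delta \mu _\alpha \right) , \qquad \Delta \mu = \mu ^{ff} - \mu ^{00} ,
\]

where \(\mu ^{0f}\) is the transition dipole and \(\Delta \mu \) is the change in permanent dipole upon excitation. In this truncated picture the cross section, which scales with the square of \(S\), vanishes when \(\Delta \mu \) vanishes, so any remaining intensity must arise from transition dipoles coupling through other intermediate states. Note that it is the dipole change, rather than the ground-state dipole alone, that enters this expression.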

This investigation also highlighted that, while providing valuable insight into the photochemistry of these compounds, the inclusion of either hydrogen-bonding or \(\pi \)-stacking effects with a single nucleobase was insufficient to explain the discrepancy observed between the theoretical and experimental results. Further investigation is warranted to determine appropriate quantitative improvements to the model system utilised for similar and future studies. These improvements may involve the inclusion of: i) more than one \(\pi \)-stacking base to sandwich the analogue; ii) the ribose sugar, which may have an effect on both the geometry adopted and the excited-state character; iii) explicit solvent molecules and key coordination sites; and iv) the combination of both \(\pi \)-stacking and hydrogen-bonding effects within the same calculations. Due to the potentially drastic increase in the size of models accounting for these additions, the use of alternative methodologies better equipped to deal with larger structures should also be probed, including the use of the Cholesky decomposition, resolution of identity, or entangled TPA methodologies.

In conclusion, while the majority of current nucleotide base analogues do not lend themselves to use with two-photon methodologies, there are a number of promising candidates, as well as significant design potential for the targeted development not only of novel, two-photon dedicated analogues, but also of analogues specifically designed to take advantage of both the increased resolution and tissue penetration of two-photon methodologies in the design of photosensitising compounds that can be embedded into a given DNA primer to enable photoinduced, gene-sequence-targeted DNA damage, whether in pathogens or in cancer cells.