Introduction

DNA G-quadruplexes (G4s) have structural features different from the classical double helix; they are quadruple helical structures that exhibit variable and dynamic topology dependent on sequence and environment. G4 structures are formed in nucleic acid sequences rich in guanine. Because it possesses two perpendicular faces of H-bonding functionality, with opposite hydrogen bonding polarities, guanine is able to base pair with two separate guanine residues, forming planar structures called G-tetrads (Fig. 1a) [1, 2]. These G-tetrad motifs arise from the association of four guanines into a cyclic Hoogsteen hydrogen bonding arrangement, and they stack on top of each other to form a G4 (Fig. 1b, c). G4 formation is favoured and stabilised by monovalent cations, with potassium being the most effective stabiliser followed by sodium, as well as by π-π interactions between the G-tetrad motifs [3].

Fig. 1

a Representation of a G-tetrad, showing the two bonding faces, Watson-Crick and Hoogsteen; b Cartoon representation of a unimolecular parallel G4 (PDB: 1KF1), lateral view; and c top view. Potassium ions are represented as spheres in blue and purple

Structures covered in this review contain one, two or four separate strands, referred to as unimolecular, bimolecular, and tetramolecular, respectively. G4 forming sequences on a single strand can interact intramolecularly to form G4s. Unlike in duplex DNA, the strand 5ʹ-to-3ʹ directionalities (or polarities) are not generally constrained to be antiparallel, and collectively, their orientation is useful in the topological classification of the quadruplex. If the polarities of all strands are orientated in the same direction, then the quadruplex is said to be parallel. In contrast, if the strands are orientated so that their polarities are opposite to those of their neighbours, the quadruplex is termed anti-parallel. Naturally, deviations from this architecture are observed that contain both parallel and anti-parallel characteristics, and these are termed mixed or hybrid topologies. In quadruplex systems with fewer than four separate chains, short loop regions, of predominantly thymine and adenine content, link associating G-tracts. The length and sequence composition of these loop regions determine the preference for topology, alongside salt character and concentration. Loop regions can generally adopt one of three types: propeller, connecting adjacent G-tracts and preserving parallel polarity; lateral, connecting adjacent G-tracts but reversing polarity; and diagonal, connecting opposite G-tracts and also reversing polarity. Common examples of G4 topologies are illustrated in Fig. 2. Note: Due to the extensive polymorphism exhibited by G4s, a classification of topology using classical helical parameters is inappropriate, and other classifications have been developed [4].
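The polarity-based classification described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions rather than an established classification scheme: each of the four G-tracts is given a direction label ('+' or '-'), and the counting rule used (4:0 parallel, 3:1 hybrid, 2:2 anti-parallel) is a commonly used shorthand that ignores loop types and cannot distinguish, for example, chair from basket arrangements. The function name and labels are hypothetical.

```python
from collections import Counter


def classify_topology(strand_directions):
    """Classify a G4 core from the 5'->3' direction of its four G-tracts.

    strand_directions: four characters, each '+' or '-', giving the
    direction of each G-tract relative to an arbitrary reference.
    Assumption: topology is assigned purely by counting directions.
    """
    directions = list(strand_directions)
    if len(directions) != 4 or any(d not in ("+", "-") for d in directions):
        raise ValueError("expected exactly four '+'/'-' direction labels")

    # Size of the larger group of strands sharing one direction (4, 3 or 2).
    majority = max(Counter(directions).values())
    if majority == 4:
        return "parallel"
    if majority == 3:
        return "hybrid (3+1)"
    return "antiparallel"


print(classify_topology("++++"))  # all strands in the same direction
print(classify_topology("+-+-"))  # each strand opposite to its neighbours
print(classify_topology("++-+"))  # mixed polarities
```

A fuller treatment would also record the loop type (propeller, lateral or diagonal) linking each pair of G-tracts, since the same direction count can arise from different folds.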

Fig. 2

Structural topologies of commonly observed G-quadruplexes

Davies et al. first hypothesised in 1962 that guanylic acid (GMP), a guanine analogue, could form tetrameric structures in solution, and a structural model built from fibre diffraction data showed that the structural basis was the G-quartet of Fig. 1a [1, 5]. The formation of such G-quartets in a single DNA strand (unimolecular structure) was shown in 1994. An NMR structure of a 22-mer DNA single strand with the human telomeric sequence d(AGGGTTAGGGTTAGGGTTAGGG) in Na+ solution revealed the antiparallel basket topology, with two lateral and one diagonal loop [6]. In 1998, the telomeric fragments TTAGGGTT, TTAGGGTTA, and TAGGGTTA were studied by NMR, and a parallel intercalated structure was proposed for the TAGGGTTA sequence and a 3,4,9,10-perylenetetracarboxylic diimide-based ligand [7]. The ligand is modelled as bound between a G4 and T4 tetrad. No coordinates for this model are available, but the same sequence was reported by us in 2019 to form an antiparallel assembly, using a ligand with a preference for an antiparallel topology (PDB: 5LS8; see later) [8]. The first crystal structure of a unimolecular G4, using the 22-mer telomeric sequence DNA above, was reported by Parkinson, Lee and Neidle in 2001, crystallised in the presence of the physiologically relevant K+ ion, and revealing a parallel topology with propeller loops (PDB: 1KF1) [9]. At time of writing, a search of the Nucleic Acid Knowledge Base (NAKB) for DNA structures classified as containing a G4 motif gives 123 X-ray structures and 259 NMR structures (including structures with bound ligands), and it remains a highly active field, with the emphasis now shifted to unimolecular non-telomeric G4s.

The prevalence and distribution of G4 structures in the human genome have been addressed using two different approaches: computational prediction and high-resolution sequencing-based methods [10,11,12]. These non-canonical structures have been found in vitro and in vivo located in telomeres as repeated units, where they directly inhibit telomere elongation. Additionally, they are found in the promoter region of oncogenes, such as c-KIT, c-MYC, VEGF, and BCL2, where they play a key role in regulating gene expression [13,14,15,16,17,18,19,20,21,22,23,24]. It is estimated that more than 40% of human genes contain in their promoter region at least one quadruplex motif [25].
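As an illustration of the computational prediction approach, the sketch below scans a sequence for a widely used putative G4 pattern: four runs of at least three guanines separated by loops of one to seven nucleotides. The pattern and the function name are assumptions chosen for illustration; the studies cited above may have used different or more elaborate rules, and a sequence match does not show that a G4 actually forms.

```python
import re

# Assumed folding rule: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (DNA alphabet only).
G4_MOTIF = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end, matched_text) for non-overlapping motif hits.

    Only the given strand is scanned; a G4 on the complementary strand
    would appear here as a C-rich pattern and is not reported.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(seq)]


# The 22-mer human telomeric sequence discussed in the text.
telomeric_22mer = "AGGGTTAGGGTTAGGGTTAGGG"
for start, end, text in find_g4_motifs(telomeric_22mer):
    print(start, end, text)
```

Because the search is greedy and non-overlapping, long G-rich regions that could fold in several registers are reported as a single hit.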

Transcription factors (TFs), which are proteins that control the flow of genetic information from DNA to mRNA, can interact with DNA in two different ways: “base readout” and “shape readout”. The former is a physical interaction between the amino acid side chains of the TF and the base pairs of DNA, primarily situated in the major groove, allowing TF proteins to bind to specific DNA sequences. The latter, “shape readout”, is based on the recognition of a particular DNA structural feature, such as a G4 [26, 27]. For this reason, the DNA G4 motif has gained considerable attention as a possible therapeutic target. Selective stabilisation of telomeric G4 structure can inhibit the telomerase enzyme activity, leading to the destabilisation of telomere maintenance and consequently inhibiting cancer cell growth. Additionally, small molecules capable of stabilising a specific G4 structure can be used to inhibit particular oncogenes, offering a targeted approach to cancer therapy.

A large number of G4-targeting ligands have been designed and synthesised, most of which possess common features, such as a polycyclic aromatic core essential for π-stacking with the face of the G4 quartet, the presence of cationic charges positioned at the centre of the G-quartet, and the presence of positively charged substituents that can interact electrostatically with the negative sugar-phosphate backbone of DNA [28, 29].

Alongside DNA G4, RNA G4 motifs have also been demonstrated to exist in vitro and in vivo, playing crucial roles in different biological processes, including modulation of transcriptional, co-transcriptional, and posttranscriptional events [30,31,32]. Therefore, the involvement of these structures in key biological processes makes RNA G4 an important biological target. However, as this review focuses on metal complexes targeting DNA G4 structures, please see the more comprehensive discussions on the role of RNA G4 as potential therapeutic targets found elsewhere [32,33,34].

As interest in developing drugs that target G4 structures grows, so does the need to understand their three-dimensional structure and how they bind G4 ligands. In most cases, the structures derived from NMR and X-ray crystallographic data are the same, but not always, as interactions between loops in the crystal can favour parallel topologies. We have included a short section comparing the two techniques in our closing summary, as both are indispensable in this area.

As far as the choice of complexes is concerned, transition metal-based compounds, such as those of ruthenium with their great variety in structure and rich photophysical and photochemical properties, have shown great potential as anti-cancer and anti-microbial agents, in photodynamic therapy (PDT), and as molecular probes [35,36,37,38]. Their photophysical and photochemical properties originate from a variety of excited-state electronic configurations accessible with visible and near-infrared light. In general, upon irradiation at a wavelength within the absorption range of the specific complex, an electron is promoted from the ground state (S0) to a singlet excited state, which subsequently undergoes rapid internal conversion (IC) to the lowest singlet excited state (S1). From there, rapid dissipation of energy from the lowest singlet excited state to the ground state can happen through a process referred to as fluorescence. Alternatively, the electron may undergo a “forbidden” transition to an overlapping triplet state (T1) by a process called intersystem crossing (ISC). Molecules in the lowest triplet excited state can relax to the ground state with photon emission through a process called phosphorescence, or transfer energy from the triplet excited state to nearby oxygen or substrate, leading to the generation of reactive oxygen species (ROS); where the energy is transferred to oxygen, singlet oxygen is generated.

Transition metal complexes characterised by a d6 or d8 electronic configuration have attracted considerable attention thanks to their high stability and their characteristic three-dimensional structure, with octahedral and square planar complexes the best-studied geometries.

Different transition metal complexes have been designed and synthesised to target DNA via non-covalent interactions, spanning from the most biologically common form, B-DNA (e.g. metallo-intercalation) [39,40,41] to G4 structures [42, 43]. In general, metal complexes that target the G4 are characterised by having a ligand with a large surface area in order to π-π stack with either or both end G-tetrads of the G4 structure. A square planar complex can stack directly, whereas an octahedral complex should have an extended planar ligand so that the metal centre can lie in one of the grooves. These complexes can be completely selective for G4s. Related complexes of these metals, with smaller or non-planar ligands, have been optimised to target errors in B-DNA that can arise during replication: single and consecutive double mismatches, an interaction that occurs via metallo-insertion [44,45,46]. All these types of DNA recognition are achievable using metallo-drugs, sometimes just by changing a single ligand. The use of bulky lipophilic ligands is also a crucial strategy for localising the metal complexes in the nuclei of cells or in mitochondria, which is essential for targeting DNA [47, 48]. Therefore, the versatility and adaptability of transition metal complexes make them powerful tools in the recognition of various DNA structures [49].

In summary, this review aims to provide a short overview of the interactions between metal complexes and G4s, in particular metal-salphens, gold N-heterocyclic complexes, and ruthenium polypyridyl complexes, with a particular focus on structural insights obtained through X-ray crystallography and NMR.

Substituted acridines

To put the more recent studies of metal complex binding in context, this short account starts with the seminal studies of acridine binding carried out by Neidle and coworkers, which laid the foundation for subsequent work. Importantly, it was shown that the strength of G4 stacking, unlike that of the B-DNA double helix, is typically such that intercalative binding is not seen. Instead, the characteristic binding mode is end-stacking. There are two such ends that are not normally equivalent, and in some of the studies discussed below, a preference for one end is evident.

Hurley, Neidle et al. were the first to describe the ability of a small molecule to selectively inhibit telomerase activity. Aided by a structure-based approach to drug development and spurred on by previous investigations with triplex interactions, they discovered that a class of compounds, the 2,6-diamidoanthraquinones, were potent G4 binders which disturbed the enzymatic action of telomerase (IC50 of 23 µM) [50]. A systematic analysis of analogues, with both the π-stacking heterocyclic and the flexible amido chain regions varied, determined the factors promoting a more selective interaction. It was demonstrated that an acridine-based aromatic core was more active than the anthraquinone moiety in terms of telomeric G4 binding. This was associated with the introduction of a nitrogenous heterocycle that, under physiological pH, could be protonated to provide an electron-poor chromophore, complementing the central K+ channel [51]. Further structure-activity relationship (SAR) analyses led to the development of bi- and tri-substituted amidoacridines [52]. It was determined that bulky non-aromatic substituents on the side chains destabilise the G4, but it was concluded, using a combination of solution experiments and computation, that the 3,6,9-trisubstituted aminoalkylamido acridines were the most potent of the three regioisomeric series examined. Specifically, the compound coined BRACO-19 (PDB: 3CE5, Fig. 3a) emerged as the top candidate, exhibiting high target selectivity/affinity to G4s, as well as rapid uptake by host cell nuclei [53]. BRACO-19 became one of the most intensively studied quadruplex ligands, inducing long-term growth arrest and replicative senescence in carcinoma cell lines in vivo, as well as being used to demonstrate the positive regulatory role of G4s in the transcription of the hepatitis B virus [54, 55].
Given this promise, it was disappointing that BRACO-19 failed to reach clinical trials as a result of inherent solubility and membrane permeability limitations [56].

Fig. 3

3,6-bis- and 3,6,9-tris-substituted acridines successfully crystallised with G4 forming oligonucleotides. a species containing protonated acridine core; and b species containing origin species. PDB codes of coordinates are given in bold

The structures determined as part of this work show many of the key features also seen with the subsequently studied metal complexes. Figure 4a shows the native Oxytricha nova G4 sequence d(GGGGTTTTGGGG) determined at 1.6 Å resolution by X-ray crystallography. Figure 4b shows the interaction of a 3,6-disubstituted acridine with the same sequence (PDB: 1L1H).

Fig. 4

The Oxytricha nova G4. a Native bimolecular d(GGGGTTTTGGGG) determined by NMR; b 3,6-disubstituted acridine bound to d(GGGGTTTTGGGG) determined by X-ray crystallography; c Superimposition of the native and bound complexes highlighting the similarity in DNA morphology, with the ligand omitted for clarity

The structure of the native DNA shows that the G4 takes on a bimolecular anti-parallel arrangement with diagonal loops, agreeing with previous solution-state NMR studies of the sequence in both Na+ and K+ ionic environments [57]. Instructively, but not always the case in this field, very little change to the DNA global structure is observed upon binding of the acridine derivative, as seen in Fig. 4c. The ligand binds at a 1:1 stoichiometry to the biological unit and binds through one diagonal loop in an end-capping/threading mode, π-stacking predominantly on two anti-guanosines located on one side of the terminal tetrad. The pyrrolidinopropioamide chains, thought to interact with the grooves, are splayed out towards the grooves but are not long enough to penetrate them. The protonated ends take part in weak H-bonding with an exocyclic N2 of guanine and with local ordered water. Although global DNA structure is conserved upon binding, local variations in loop geometry are observed. Thymine-3 in the binding loop pocket rotates to stack on top of the threading acridine core, and thymine-4 in the loop flips out of plane into the mouth of the wide groove to accommodate the ligand.

The structural effect of adding a bulky substituent to the end of the amido chain of these ligands was studied [58]. The study concluded that the addition of steric constraints to the chains, in the form of progressively larger pyrrolidino rings, does not hinder the ability to bind to the diagonal anti-parallel loops. The nine available structures of disubstituted acridines are superimposed in Fig. 5, clearly demonstrating the consistent binding site across all these structures (Fig. 5c). Given this, the role of the loops in determining specificity was investigated. Short diagonal loops (≤4 nucleotides) could allow wider ligands to bind, whereas short propeller-type loops generate a far more constricted binding pocket. This difference could explain the drops in association constant (Ka) with the parallel stranded human telomeric sequence when the amido chains are substituted with piperidino (7-fold) or azocano (3-fold, PDB: 3EUM) rings, respectively. In a similar study, fluorination of the peripheral pyrrolidine moieties on the alkylamido chains was investigated [59]. Thermal melting analysis suggested that the β-fluorinated analogues exhibited at least half the stabilisation effect on the bimolecular anti-parallel quadruplex in comparison to the parent. Crystallographic studies of the binding pattern of the bis-3-fluoropyrrolidine enantiomers, (R,R) and (S,S), to the same Oxytricha G4 yielded almost isomorphous binding morphology to the parent complex, showing a terminal heterocyclic pucker and the subsequent small hydrogen bonding changes. The authors attributed the loss in stability to this new H-bonding network since the principal interactions were now associated only with the uppermost loop, suggesting that the ligand cannot anchor the two strands together.
Overall, this detailed study highlighted the key importance of π-stacking in determining the fixed binding orientation of a chromophore such as acridine, and the potential of side chain substitution to increase selectivity and potency.
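To give a sense of scale for the fold changes in Ka quoted above, a fold drop in association constant can be converted to a change in binding free energy using ΔΔG = RT ln(fold). The sketch below assumes a temperature of 298.15 K, which is not stated in the text, and the function name is hypothetical; it is an order-of-magnitude illustration rather than a reanalysis of the published data.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold_change(fold_drop, temperature_k=298.15):
    """Free energy penalty (kcal/mol) for a given fold drop in Ka.

    Uses ddG = R * T * ln(Ka_parent / Ka_analogue); the default
    temperature is an assumption, not a value taken from the studies.
    """
    if fold_drop <= 0:
        raise ValueError("fold_drop must be positive")
    return R_KCAL * temperature_k * math.log(fold_drop)


# Fold drops quoted in the text for the two ring substitutions.
for label, fold in (("piperidino", 7.0), ("azocano", 3.0)):
    ddg = ddg_from_fold_change(fold)
    print(f"{label}: {fold:g}-fold drop in Ka ~ {ddg:.2f} kcal/mol")
```

Under these assumptions the 7-fold and 3-fold drops correspond to roughly 1.15 and 0.65 kcal/mol, that is, penalties on the order of a single weak non-covalent contact.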

Fig. 5

Substituted acridine binding. a Superimposition of the nine d(GGGGTTTTGGGG) sequences, with the disubstituted acridine ligands omitted, to highlight similarities in the DNA morphology. b, c Superimposition of the same nine structures of DNA-disubstituted acridine complexes in two different perspectives. Note the similarity in DNA architecture and the rigidity in the binding of the π-stacked acridine core, which maintains the same orientation regardless of the substitution pattern

Binding to G-quadruplexes – structural insights

Since the demonstration that G4s were transient forms in cells [20], the therapeutic significance of G4 targeting drugs has become apparent. Many compounds with diverse structures have been tested for efficacy as G4 binders, but only a small fraction of them have been structurally characterised with G4s. A search of the Protein Data Bank (PDB) (or the Nucleic Acid Knowledge Base (NAKB)) yields nine sets of experimentally derived structural coordinates of G4-metal complex systems obtained using X-ray crystallography and nine using NMR, as shown in Tables 1 and 2, respectively.

Table 1 Summary of all X-ray structures containing a G4-metal complex available in the PDB
Table 2 Summary of all NMR coordinate sets containing a G4-metal complex available in the PDB

Metal complexes

The importance and therapeutic application of metal ions in nucleic acid chemistry has been known and studied for a long time. The ionic atmosphere around DNA, due to the negative charges on the phosphate backbone, affects everything from the local folded structure to the vulnerability of the genomic code to damage [60, 61]. Specifically, in the telomeric assembly, there are two central K+ ions, each coordinated in distorted square antiprismatic coordination to eight guanine bases, through the carbonyl group at the 6-position (Fig. 1). These K+ positions are integral to biologically relevant G4 structures, clearly defined using X-ray diffraction, but have to be inferred in NMR structure determinations. The binding of positively charged metal complexes can also contribute to the neutralisation of the overall negative charge of the G4. To date, metal complexes of Co(III), Ni(II), Ru(II), Pt(II) and Au(I)/(III) have been studied as potential therapeutics and probe molecules, making use of the combination of extended planar ligands and either square planar or octahedral geometry. The ligand binding preferences of these metals allow an endlessly diverse range of complexes to be generated. None to date have been studied in quite such a systematic way as potential therapeutics as the acridine example above. Given the large interest in such complexes, rather little structural information has been published showing their interactions with G4s. Currently, only 18 structures are present in the PDB (nine X-ray diffraction and nine solution NMR), and the area is ripe for further development.

Metal salphens

Initially developed by qualitative modelling investigation, the metal-salphens have since been shown to be strong G4 binders and potent inhibitors of telomerase [62,63,64]. Consisting of a heteroaromatic bis-Schiff base derivative 4-coordinated to a square-planar/pyramidal metal centre, this family of complexes has been systematically optimised to be proficient binders of telomeric G4. Indeed, different strategies like the design and synthesis of dimeric Pt(II) salphen compounds to target consecutive G4s at the telomeric region have been employed [65]. Central metal ion type, coordination geometry, and substituent effects have all been investigated.

For this purpose, Vilar et al. investigated the DNA binding of a series of metal complex analogues (Ni2+, Cu2+, Zn2+, and V4+) having square-planar and square-based pyramidal three-dimensional structures [63]. These experiments highlighted the importance of the coordination geometry around the metal centre, demonstrating how square-planar complexes such as that of Ni2+ can, thanks to this geometry, easily π-stack on top of the G-quartet, stabilising the G4 structure. On the other hand, a square-based pyramidal geometry, as in the Zn2+ complex, displayed almost no affinity towards G4.

In addition to increasing solubility, the number and nature of the substituents located on the salphen ligand largely dictate the resulting affinity and structural selectivity of the complex. As with the acridines, FRET analysis helped to establish that pyrrolidinium and piperidinium were the most suitable heterocyclic ends for the ether-linked alkyl substituents [63]. However, Vilar et al. found that derivatisation around the central phenyl ring was more important (Fig. 6a). Phenyl ring substitution always led to a decrease in FRET melting temperature with a telomeric 22-mer sequence [42], but substitutions can also improve selectivity between duplex and quadruplex DNA. Investigation into the effect of the central ion concluded that square planar Ni(II) and Cu(II) complexes were more stabilising and showed higher antiproliferative properties and more effective telomerase inhibition when compared to the pseudo-square-pyramidal Zn(II) and V(IV) complexes, presumably because the square planar coordination allows the metal to sit close to the K+ channel [66].

Fig. 6

Salphen metal complexes. a The salphen complexes used, with the fluorinated central phenyl ring; b, c Two views of the copper complex (PDB code 3QSC) bound to d(AGGGTBrUAGGGTT), highlighting the lateral and top view, respectively; d Superimposition of the two salphen structures, 3QSF and 3QSC, coloured light blue and grey, respectively, highlighting the close similarity between the two structures; e, f Two views of the nickel complex analogue (PDB code 3QSF)

Crystal structures of square planar Ni(II) and Cu(II) metal salphens (Fig. 6a) bound to a bimolecular brominated sequence based on the human telomeric d(GGGTTA)n unit have been reported [42]. Both structures (Fig. 6b–f) contain biological units comprised of a bimolecular all-parallel quadruplex formed by d(AGGGTBrUAGGGTT) with a two-fold symmetry axis running down the central (helix) axis, and the metal complex disordered about this axis. Hence, in Fig. 6c, f, which are projections down this symmetry axis, only one of the two disordered complexes is shown for clarity, but the asymmetric unit (smallest repeating unit) is the 12-mer single strand. The assembly is stacked on a second symmetry-related unit, giving a run of five K+ ions. These points are mentioned as they are representative of features often seen in crystal structures which would not be present in solution. In NMR-determined structures, the ligand binding mode would typically be the same, but such aggregation would be unusual. In these examples, the complexes are seen to bind in a typical end-capping fashion, as previously suggested by molecular modelling calculation; however, the flipping in of the terminal thymine is unexpected, stacking over the ligand as shown in Fig. 6c, f. As designed, the central metal ions of the complex are situated almost in line with the K+ channel but cannot directly coordinate to any guanine 6-carbonyl group. Although containing different metal centres, the overall assemblies are isostructural. The authors noted deviations from planarity of the salphen ligand when comparing the bound Ni(II) and Cu(II) complexes, with the Cu(II) bent out of plane. This additional bowing affects the π-stacking overlap with the G-tetrad, giving a difference in stacking distance of 0.2–0.3 Å, consistent with the lower binding affinity of the Cu(II) complex. 
In addition, the structure allowed the authors to propose a rationale for the decrease in affinity seen on substitution of fluorine in the central phenyl ring. Although included to increase favourable π-stacking by electron withdrawal, the structure actually shows that the substituted ring is only partially overlapping with the adjacent base, and unfavourable repulsive interactions occur between a fluorine and a guanine carbonyl group. Substitution of donor groups at this position could exploit this interaction. The structure is, therefore, a very nice example of the use of such data to interpret biophysical and biochemical results.

Related salphen-like metal complexes containing Ni(II), Cu(II), and Zn(II), but incorporating imidazole rings, were subsequently studied [67]. DNA binding studies showed that the three salphen-like complexes have a similar binding affinity towards calf-thymus DNA (CT-DNA), but differences were observed when G4 DNA was added: Cu(II) > Ni(II) > Zn(II). Interestingly, MD simulations show a loop binding mechanism of the Cu(II) analogue with the human telomeric sequence. The diversity of salphen-like metal complexes has been further extended with the synthesis and characterisation of non-charged Cu(II), Ni(II), Zn(II), Pd(II) and Pt(II) metal complexes, bearing chlorine atoms as substituents, and to salphen Co(III) complexes [68, 69].

Gold N-heterocyclic complexes

Gold-centred organometallics have been developed as potent cytotoxins with structural selectivity. Gold(I) mono/dicarbene species are especially promising candidates due to their physiological stability, antineoplastic activity, and lower systemic toxicity than earlier cytotoxic gold complexes [70]. N-heterocyclic gold(I) carbene (NHC) complexes have been shown to be potent inhibitors of mitochondrial selenoenzymes, as well as inhibitors of proteasome and telomerase activity [71, 72]. Antitelomerase activity has also been shown to be a distinct mode of action for the antiproliferative effect of auranofin, a repurposed gold(I) thiolate-based antirheumatic agent that has been in clinical trials for the treatment of ovarian cancers, with a mode of action distinct from that established for the platinum compounds [73].

The cationic gold(I) bis-carbene, [Au(9-methylcaffein-8-ylidene)2]+ has been shown, using an in vitro FRET melting assay, to be completely quadruplex specific in its binding and, later, to be a selective cytotoxin to cancer cells [74, 75]. The binding mode of this complex to the telomeric G4 forming sequence d(TAGGG(TTAGGG)3) has been structurally characterised in a combined X-ray crystallographic and ESI-MS study [76]. The crystal structure (PDB: 5CCW) shows that the parallel topology of the quadruplex is maintained upon binding of the complex (Fig. 7a). Also supported by solution MS, the structure shows how the complex binds by end-capping on both 5′ and 3′ tetrad faces in a maximum 3:1 stoichiometry across the biological unit. Two complexes fit neatly side by side on one face, each end-stacking with two guanine bases, and showing minimal loop interaction with the propeller loops. A single complex end stacks to the opposite face.

Fig. 7

Gold complexes. Chemical structures and refined crystal structures of the gold complexes that have been structurally characterised with G4 DNA. PDB codes from left to right: a 5CCW, b 6H5R, and c 7QVQ. For 7QVQ, only a single orientation of the four-fold disordered complex is shown for clarity (see text)

In a related experiment, the interaction of a simple gold(I) bis-carbene [Au(NHC)2]+ was investigated in the presence of different telomeric G4s [77]. Crystallisation was achieved with the sequence d(TAGGG(TTAGGG)3T) (PDB: 6H5R) and similarly produced an all-parallel topology (Fig. 7b), but in this case, the complex is observed at a stoichiometry of 1:1 to a biological unit of a single DNA strand, as adjacent quadruplexes are stacked to give a dimeric unit in the crystal. Mass spectrometry also suggested a 1:1 binding stoichiometry, and melting analysis indicated no clear increase in melting temperature (Tm). The complex, which is disordered on the G-tetrad surface, is π-stacked across two guanine residues on the 3′ tetrad; as with structure 5CCW, the metal centre is not aligned with the central ion channel.

Figure 7c shows the structure deposited with PDB code 7QVQ, and is included for completeness. The planar gold(I) complex shown has been crystallised with the parallel G4 formed by the 24-mer human telomeric sequence (Table 1). The resulting deposited coordinates and experimental measurements show that the crystal used for the analysis contains quasi-infinite stacks of the complex alternating with the parallel G4, in 1:1 stoichiometry, with unclear density for the metal complex, typical of disordered binding. The depositors have modelled four orientations of the complex into the density, with just one orientation illustrated in Fig. 7c. This is a rare example where the amount of useful information in the structural data is limited, and does not go much further than confirming the end-stacking binding mode.

Organometallic gold(III) complexes, such as the N-heterocyclic dioxo bridged binuclear complex shown in Fig. 8a, have been shown to exhibit similar in vivo cytotoxicity (low µM), and inhibition of selenoenzymes, proteasome action and telomerase, to the gold(I) carbene species [78, 79]. The complex (Fig. 8a), which has exhibited marked G4 affinity and selectivity, was the subject of a structural investigation, in which the interaction of the complex with the telomeric sequence d((TTAGGG)4TT) was analysed using solution NMR methods (Fig. 8b, c) [80]. The ligand π-stacks on the 5′ tetrad in a pseudo threading/end-capping fashion and, thanks to its large footprint, interacts with three separate guanine bases within a single tetrad. In comparison with the native NMR structure (PDB: 2JPZ), the DNA loop regions in the bound complex have undergone structural rearrangement (RMSD = 3.4 Å) to accommodate the ligand, but the overall hybrid 2 topology is conserved. The Au2O2 central bridge is symmetrically stacked above the inferred K+ ion positions (the K+ locations cannot be determined directly from NMR data), and no direct gold-guanine base interactions were seen.
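The RMSD quoted above is a root-mean-square deviation over matched atom positions in the two structures. The sketch below shows only the final step of such a comparison, and assumes that the two coordinate sets have already been superimposed and that atoms are listed in the same order; the atom selection and fitting procedure behind the 3.4 Å value are not given here, and the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input coordinates).

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples for
    matched atoms. No superposition is performed here (assumption: the
    structures were aligned beforehand).
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")

    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms, for illustration only.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
bound = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 2.0)]
print(f"RMSD = {rmsd(native, bound):.2f} A")
```

In practice an optimal superposition (for example by a least-squares fit over the G-tetrad core) would be carried out first, and restricting the atom selection to loops or to the core can change the reported value considerably.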

Fig. 8

Dimeric gold complex. a Chemical structure and b NMR models of the gold complex structurally characterised with G4 DNA (PDB: 5MVB). c Superimposition of the G4 sequence upon binding by the ligand and as the native (PDB: 2JPZ)

Unlike the metal salphens and other planar cationic species, the central metal ions in the available gold structures show no tendency to stack in line with the K+ channel. In the case of gold, the positive charge will be much more delocalised over the ligands, so this is not surprising. In the absence of steric limitations, in these cases, electrostatics and favourable π overlap influence the binding pocket more than the presence of a metal centre.

Octahedral ruthenium polypyridyl complexes

Octahedral metal complexes, principally those of ruthenium(II) with its 4d6 electron configuration, have also been examined as promising G4 binders. Unlike the square planar examples considered above, structural evidence shows the metal centre lying in a groove of the G4, with one polypyridyl ligand sitting on a terminal G-quartet, and two other ligands (often called ancillary ligands in the coordination chemistry literature) making contacts within the groove. The inherent three-dimensionality is a key property when aiming for topological specificity, since secondary DNA interactions with ancillary ligands can, enantiospecifically, change preference between topologies by interaction with strand polarity, syn/anti sugars, and loop regions at the binding site, resulting in a form of G4 intercalation. Modification of the structure of the intercalating ligand, by incorporation of specific functional groups or expansion of the ligand scaffold to larger π-extended ligands, leads to G4 specificity over duplex DNA. Work from our own group has demonstrated how these two approaches can lead to the design of metal complexes with such G4 specificity, combined with possibly useful photoproperties.

In 2019, we provided the first crystallographic evidence of two different mononuclear ruthenium polypyridyl complexes, rac-[Ru(TAP)2dppz]2+ (TAP = 1,4,5,8-tetraazaphenanthrene; dppz = dipyrido[3,2-a:2′,3′-c]phenazine) and the analogue rac-[Ru(TAP)2(11-CN-dppz)]2+, bound to tetramolecular G4s [81, 82]. The DNA sequence d(TAGGGTTA) assembles to give a tetrameric quadruplex which is parallel-stranded when unbound, but which crystallised antiparallel, each lambda metal complex stabilising one syn-guanosine conformation, to give an assembly with four metal complexes and four DNA strands (Fig. 9a). The delta enantiomers of the complex do not bind to the G4 at all, but stack at the end of the assembly in the crystal lattice. Remarkably, the truncated sequence d(TAGGGTT) crystallised with the pure lambda enantiomer of the unsubstituted complex [Ru(TAP)2(dppz)]2+ to give a strikingly different structure in which there are four different metal complex environments, none involving any contact between the dppz ligand and the G-quartets. This result unexpectedly confirmed the strong effect of dppz substitution by the cyano group, the electron withdrawal by this group enhancing the donor-acceptor nature of the stacking interaction.

Fig. 9
figure 9

Luminescent ruthenium complexes. a, b Chemical structure of [Ru(phen)2(11-CN-dppz)]2+ ruthenium complex by itself and bound to the tetramolecular d(TAGGGTTA) G4 (PDB: 5LS8), as the TAP analogue. Lambda enantiomers of the complex are represented as spheres. c Λ-[Ru(phen)2(11-CN-dppz)]2+ bound to the sequence d(TCGGCGCCGA) (PDB: 6HWJ)

Structural analysis of the binding of Λ-[Ru(phen)2(11-CN-dppz)]2+ to the duplex-forming d(TCGGCGCCGA) helped to interpret the structure-selective luminescence behaviour of this metal complex upon G4 binding compared to duplex DNA (Fig. 9) [83].

It can be seen from the structure in Fig. 9b that, when bound by end-stacking to the G4, the intercalating ligand is well protected and embedded by the G4 DNA structure. In this case, the binding mode is truly intercalative because the antiparallel assembly contains AT base pairs. Therefore, no quenching by the surrounding water molecules is possible. On the other hand, when bound to duplex DNA (Fig. 9c), the -CN group on the dppz ligand protrudes into the major groove. This structural analysis is useful to understand the difference in luminescence behaviour when the light-switch complex Λ-[Ru(phen)2(11-CN-dppz)]2+ interacts with both DNA forms. Indeed, we observe that the complex is non-emissive when bound to CT-DNA but luminescent when bound to G4 [81].

The second approach we have used to increase the binding affinity and specificity towards G4 structures consists of using larger π-extended ligands. This approach led to the synthesis of [Ru(phen)2qdppz]2+ (qdppz = 12,17-dihydro-naphtho-dipyrido-phenazine-12,17-dione) and its hitherto unknown TAP analogue [43].

Here, Λ-[Ru(phen)2qdppz]2+ (Fig. 10a) was crystallised with a modified human telomeric G4 sequence (GGGTTA)2GGGTTTGGG in an antiparallel chair topology with 1:1 stoichiometry. A structural characteristic of this type of G4 topology is the non-planarity of the bases, which can be observed in Fig. 10b. This lack of planarity of the G-quartets suggests that ligands designed to target this specific topology could benefit from some flexibility, provided in this case by the curvature of the qdppz ligand shown in Fig. 10c. Here, the ligand overlaps with all four bases of the G-quartet and shows a bend of approximately 12°. [Ru(phen)2qdppz]2+ demonstrates high enantiospecific binding towards G4 DNA in solution. Replication assays show higher inhibition of replication for the Λ-enantiomer compared to the Δ-enantiomer towards both the native htel21 and the modified htel21T18.

Fig. 10
figure 10

Ruthenium complex with extended angled ligand. a Chemical structure of [Ru(phen)2(qdppz)]2+ ruthenium complex structurally characterised with unimolecular chair form G4 in b (PDB: 7OTB). c Superimposition of the structure containing Λ-[Ru(phen)2qdppz]2+ ruthenium complex (cyan) with the earlier tetrameric structure PDB: 5LS8 containing Λ-[Ru(TAP)2(11-CN-dppz)]2+ (grey)

Sometimes, X-ray crystallography can be a frustrating business due to the inherent properties of the crystals at hand. As a cautionary note, diffracting crystals can contain intractable disorder, which only becomes clear once data are collected and model building starts. The example now described is included because the result is relevant to the topic of this review, but the technical problems are only of interest to those interested in what are described as pathological crystals [84]. In this case, the same modified telomeric sequence was crystallised with the enantiomers of the linear analogue of the qdppz complex: linqdppz (Fig. 11). Unlike most of the examples discussed in this review, the crystallographic unit cell contained six copies of the bound G4, with an overall topology, as expected, similar to that shown in Fig. 10b. The data statistics included a strong warning of non-crystallographic symmetry (an almost exact repeat), which causes very atypical distributions of intensity in the diffraction data. The ‘almost-repeats’ have small differences which cannot be extracted from the experimental data. In this case, five out of the six assemblies can be successfully refined, leaving an intractable structural problem that is not acceptable in a structural database of primary data. The important and instructive result, worthy of note here, is that it is the delta enantiomer which is bound. The ligand in the resulting assembly is of the correct length for the G4 assembly, but the reason for the delta preference is not obvious from the structure. The ligand appears completely planar, in contrast to the bending seen with qdppz.

Fig. 11
figure 11

Ruthenium complex with extended linear ligand. a Chemical structure of [Ru(phen)2(linqdppz)]2+ ruthenium complex crystallised with the antiparallel chair form G4 formed from the modified telomeric sequence d(GGGTTAGGGTTAGGGTTTGGG). b–d Different perspectives of the structure containing Δ-[Ru(phen)2(linqdppz)]2+ ruthenium complex with the G4. Three potassium ions are clearly present (purple spheres)

Inclusion of such large hydrophobic ligands often negatively affects solubility and subsequently bioavailability, so additional metal centres can be coordinated to offset this. Such bimetallic systems may have slow binding kinetics, which can make NMR structure determination a realistic option. Species such as [Ru(phen)2(tpphz)]2+ (tpphz = tetrapyridophenazine) (Fig. 12) and the dinuclear derivative [{Ru(phen)2}2(tpphz)]4+ were reported earlier for their explicit quadruplex luminescence responses and potential as in cellulo probes; the latter has also been structurally evaluated by Thomas et al. [85]. Using a combined NMR-MM methodology, the group investigated the enantiospecific binding of ∆∆/ΛΛ-[{Ru(bpy/phen)2}2(tpphz)]4+ to the anti-parallel basket formed when Na+ is the central cation with the telomeric sequence d(AG3(TTAG3)3). In previous work, they noted high affinities for the system and determined an intense blue-shifted luminescence response of the racemate that had been attributed almost entirely to binding to antiparallel topologies with longer diagonal loops (≥3 nucleotides) [86]. They observed that the ΛΛ-isomer of the phen analogue is responsible for the bulk of the response (6-fold higher than ∆∆ at saturation), highlighting the enantiomeric differences in the interaction. Unfavourable relaxation rates hampered the NMR studies of the phen derivative, with primary NOE signals interpreted as binding of the two enantiomers to opposite G-tetrads. NOE-derived structures and unconstrained MD simulations were successful in generating binding models for the bpy analogue. Little perturbation of the native DNA conformation was seen upon binding of the ∆∆-enantiomer, and the complex was shown to stack on the opposite ‘loopless’ G-tetrad in end-capping mode. Conversely, the ΛΛ-complex was modelled as threaded through the diagonal loop of the basket topology.
The guanosine residues adjacent to the central diagonal loop are modelled as somewhat perturbed by the threading, buckling the distal G-tetrad and creating a tight binding cavity around the chromophore. Ancillary ligand interactions with neighbouring deoxyribose sugars are observed and, assuming the same binding modes, these secondary interactions would be enhanced by the larger π-surface of the phen analogue. Generation of the van der Waals surfaces for both structures highlights the encapsulation of the ΛΛ-chromophore, in contrast to the solvent-accessible binding of the ∆∆, providing a structural rationale for the observed higher luminescence of the ΛΛ-enantiomer.

Fig. 12
figure 12

Diruthenium complexes. Chemical structure of the Ru complex that has been structurally characterised with G4 DNA, and NMR models of the interaction of the two enantiomers with the sequence, d(AG3(T2AG3)3). In both cases, the complex end-caps the tetrad stack; the ΛΛ-complex threads through a diagonal loop and generates additional π-stacking with ancillary ligands. The increased luminescence response of the ΛΛ-Ru in relation to the ∆∆-Ru isomer has been attributed to the additional encapsulation of the chromophore by the diagonal loop

Platinum metal complexes

The general structure of platinum “tripods” consists of a central non-planar tertiary amine carrying three long pendant arms (a triphenylamine, TPA, tripod), comprising three aromatic rings and capped with platinum-centred units. These triphenylamine tripods are of particular interest as they have long been known to be both efficient DNA minor groove binders, owing to their three-dimensional structure, and promising materials for two-photon absorption (2PA) applications, useful for DNA staining and cell death imaging [87,88,89,90,91,92,93].

The spectroscopic properties of TPA derivatives, in addition to their structural similarity to other G4 binders, inspired Garcia-España et al. to design TPA derivatives that selectively target DNA G4 structures [94, 95]. Further conjugation of TPA to peripheral platinum centres was undertaken to enhance the ability of these organic molecules to produce reactive oxygen species (ROS) due to the heavy atom effect of the platinum atoms. This combination led to the development of platinum tripods as promising agents for DNA-targeted photodynamic therapy [96,97,98]. Platinum tripods can indeed interact with DNA in the nucleus, inducing ROS generation upon light irradiation, with consequent DNA damage and cell apoptosis. This dual functionality makes platinum tripods effective not only in targeting specific DNA structures but also in promoting cell death through oxidative stress.

Zong-Wan Mao et al. reported a platinum-based tripod capable of binding, with fair specificity, to the hybrid-1 telomeric G4. The resulting ligand-mediated stabilisation effectively inhibits the activity of telomerase in vitro (IC50 = 1.22 µM), as shown by a TRAP-LIG amplification assay.

NMR structural studies have defined two different binding stoichiometries (Fig. 13). The hybrid-1 structure has two distinct G-quartet ends, which can be distinguished as the 5ʹ and the 3ʹ ends. At 1:1 stoichiometry, the complex binds at the 5ʹ end, off-centre from the helical axis. The platinated arms protrude through the grooves of the quadruplex, with one of the arms partially enveloped by an A·A·T triad, possibly defining the selectivity. At the higher stoichiometry (right-hand panel), a second Pt-tripod unit stacks similarly, on the 3′ tetrad face. Even though this is a solution structure, the NMR data are interpreted by these authors as a dimeric assembly, linked by the terminal bases. The complex itself is a departure from the norm for quadruplex binding agents, containing no extended planar π-surfaces and carrying active metal centres that are designed to protrude away from the central quadruplex stack. The platinum centres are not directly involved in G4 recognition.

Fig. 13
figure 13

Platinum tripods. The non-planar Pt-tripod complex has been investigated in the presence of the human telomeric sequence, d(A3(G3T2A)3G3A2), using NMR and subsequent NOE-restrained MD. Two discrete structures were obtained from different stoichiometries, highlighting two distinct binding pockets but an overall preference for binding to the 5′ tetrad. Note the shifted location of the tripod in relation to the terminal tetrad, presumably to increase π-stacking and allow the platinated arms to fit into the grooves without dislocating the DNA backbone

Petitjean et al. in 2021 reported the first crystal structure of a platinum(II) complex with the 22AG telomeric sequence crystallised in K+ buffer [99]. This complex, identified via a high-throughput screening of a G4 stabilisation assay, demonstrated high G4 stabilisation while preserving good G4-duplex DNA selectivity. This compound is more typical of the planar metal complexes known to bind to G4s, with two extended π-systems giving a good match to the G-quartet surface.

The crystal structure shows the interaction of two metal complexes per G4 via π-π stacking. In this structure, the platinum ions line up with the potassium ions within the G4 dimer, as shown in Fig. 14b, c. The complex shows high binding affinity and selectivity towards G4 DNA.

Fig. 14
figure 14

Planar platinum complexes 1. a Structure of the planar Pt(II) complex crystallised with the 22AG telomeric DNA sequence. b, c Two different perspectives of the crystallised system (PDB: 6XCL), top and lateral view, respectively

Related platinum complexes have been studied by NMR; in this context, Zong-Wan Mao et al. have recently reported three different NMR structures of platinum compounds targeting G4 DNA structures [100,101,102].

Particularly interesting is the uncharged and non-planar platinum complex identified as Pt1 (Fig. 15a). Upon chloride loss and structural change, this complex has been demonstrated to bind G4 DNA selectively. The formation of the N-Pt bond creates a cyclometalated, positively charged and planar complex (Fig. 15a). Its interaction with G4 DNA has been demonstrated using cell studies, fluorescence lifetime imaging microscopy (FLIM), and NMR solution studies, where the NMR structure of Pt1 with VEGF G4 DNA has been determined (Fig. 15b). VEGF has a parallel topology, and the planar complex end-stacks on the 3ʹ side with the planar complex almost encapsulated.

Fig. 15
figure 15

Planar platinum complexes 2. a Structural changes that Pt1 undergoes when targeting G4 DNA. b NMR structural studies of Pt1 and VEGF G4 DNA (PDB: 6LNZ)

Two more NMR structures were published by Zong-Wan Mao et al., in which two different organic-platinum(II) hybrid complexes, L1-Pt(dien) and L1-transpt, were designed to specifically target quadruplex-lateral duplex hybrids (QLDHs) [101, 102]. Specifically, MYT1L (a guanine-rich oligonucleotide sequence from intron 22 of the MYT1L gene) was used for the NMR studies.

Both platinum(II) complexes present a “chair-type” conformation and show high affinity towards QLDHs, with little affinity towards other types of DNA topology, such as duplex DNA and G4. The two molecules bind in a similar mode: both intercalate at the quadruplex-duplex interface and interact through π-π stacking with the interfacial G-tetrad of the G4 and the adjacent base pair of the duplex, as shown in Fig. 16. While the [Pt(dien)(py)] group of L1-Pt(dien) interacts through hydrophobic interactions with the minor groove of MYT1L due to its specific three-dimensional structure, the counterpart of L1-transpt covalently binds the G6 of MYT1L highlighted in Fig. 16.

Fig. 16
figure 16

Platinum hybrid complexes. Platinum(II) complexes L1-Pt(dien) and L1-transpt, top and bottom, respectively. Solution NMR structures of both complexes have been obtained with a quadruplex-lateral duplex hybrid sequence MYT1L

Summary – binding modes and techniques

In this short account, we summarise structural data on the binding of metal complexes, and the somewhat limited information currently available on their therapeutic potential. Much work is certainly necessary to demonstrate any direct link between any of the structures reported here and what is known of the mechanism of action at the molecular level, giving much scope for innovative ligand design.

In general, the mode of binding can be described as end-capping, with the extended chromophore of the metal complex sitting on one or both G-quartet faces, and binding one or more metal complexes.

For nickel, cobalt, gold and platinum, the metal geometry is square planar, and the metal is often not directly aligned with the central potassium channel, although it may have seemed attractive to aim for such an arrangement. One possible reason is that the central channel is defined by the guanine carbonyl groups at the 6 position, and these are metals with a relatively low oxygen binding preference, typically preferring nitrogen donors. In general, such complexes bind to the parallel topology of the G4, which has only propeller loops with limited scope for interaction with the metal complex. This loop arrangement creates four approximately equal-width grooves and has approximately fourfold symmetry. The metal complexes described here often do not have such high symmetry themselves, and disordered binding orientations are seen. Improvements to create specific recognition could include a deliberate strategy to interact with loop regions to give a more ordered binding, though that may not correlate with more useful therapeutic properties as there are so many other factors to consider, such as membrane permeability/lipophilicity. The hybrid platinum complexes of Fig. 16 show how a specific feature, in this case, the duplex-quadruplex junction region, can be specifically targeted.

A different set of features is seen with the octahedral ruthenium complexes, both mono- and dinuclear. Here, the three-dimensional geometry leads directly to enantiospecific binding and, with the lambda enantiomers, the stabilisation of antiparallel geometry resulting from the switch to a syn-guanosine conformation. These structures, therefore, form a distinct group on their own, of interest not only for their topological preference but also for the photophysical properties and their dependence on binding mode, somewhat outside the scope of this review.

In conclusion, it is important to emphasise that both X-ray diffraction and NMR spectroscopy are indispensable techniques for understanding how G4 ligands bind to their targets. Each method has its own set of advantages and disadvantages.

X-ray crystallography requires the growth of high-quality crystals for single-crystal X-ray diffraction, which can be challenging due to factors such as sample purity and solubility. Additional general limitations include the size and quality of the crystal, as well as the topology of the G4 in solution. DNA sequences, such as G4-forming sequences, must normally be present in solution as a single topology in order to crystallise (requiring overnight annealing), and in some cases, perhaps a specific ligand will stabilise a specific topology. X-ray structure determination is normally ab initio in the case of the ligand-bound structures described here. The method of molecular replacement (the use of a known structure as a starting model) is normally unsuccessful, and multiwavelength methods are often used [103]. In return for this effort, though, a complete picture of the assembly, including bound water molecules and cations, is the typical result. In the case of the structures described here, the accurate coordination geometry and identity of the bound central cation of the G4 (normally K+) can be analysed, and the precise ligand conformations, including the level of disorder, determined directly from the refinement. The large number of independent observations (reflections) means that precise error statistics are available, and that it is very difficult to fit a wrong model.

The resolution (minimum d-spacing) of the measurements, reported for each structure in Table 1, is a reasonable guide to quality. Thus, the structure with code 7OTB should be the best of this group of structures, purely in terms of data quality [43]. In that work, as well as three central K+ ions, a further K+ and Ba2+ were located, along with 113 ordered water molecules per assembly. Such details aid in the optimisation of ligand design, given that ligands will often have H-bonding properties of their own.
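The link between resolution and d-spacing follows from Bragg's law, λ = 2d sin θ: the highest scattering angle at which reflections are measured sets the smallest d-spacing observed. A minimal Python sketch, using hypothetical wavelength and angle values that are not taken from any structure in Table 1:

```python
import math

def d_spacing(wavelength, two_theta_deg):
    """Return the d-spacing from Bragg's law (first order, n = 1).

    wavelength    : X-ray wavelength; the result is in the same units
    two_theta_deg : scattering angle 2θ in degrees
    """
    theta = math.radians(two_theta_deg / 2.0)
    if not 0.0 < theta <= math.pi / 2.0:
        raise ValueError("2θ must lie in the range (0, 180] degrees")
    return wavelength / (2.0 * math.sin(theta))

# Hypothetical values: 1.0 Å radiation, outermost reflections at 2θ = 60°.
d_min = d_spacing(1.0, 60.0)
print(f"{d_min:.2f} Å")  # 1.00 Å
```

A smaller minimum d-spacing corresponds to reflections measured at higher angle and therefore to finer structural detail, which is why it serves as a first guide to data quality.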

NMR spectroscopy is in general limited to relatively small assemblies such as G4-ligand complexes, and can require a relatively large amount of sample to achieve an acceptable signal-to-noise level. It is irreplaceable in the many cases, particularly where individual G4s are involved, where crystallisation does not succeed, and it was interesting to see, when preparing this review, that in this specific area, it is not possible to say that one technique dominates. Both will continue to be essential methods, especially since we are a long way from being able to make reliable predictions of ligand binding by metal complexes, even given recent advances [104]. An excellent example is provided by PDB code 2MCC. The residence time of the complex in the G4 binding site was long enough for NMR measurements (not always the case) [85]. Here, NMR provides a reliable solution structural model, based mainly on the nucleic acid component, but requiring some computational input as well. Further, NMR is the key technique for providing detailed information about the dynamics and flexibility of macromolecules in solution. Therefore, both techniques are complementary and crucial for three-dimensional structure determination in this and other nucleic acid projects, and for their application in drug design. They often aid in the development of more potent binding agents by pinpointing structure-specific binding features.