Abstract
The tumor suppressor protein p53, a transcription factor playing a key role in cancer prevention, interacts with DNA as its primary means of determining cell fate in the event of DNA damage. When it becomes mutated, it opens damaged cells to the possibility of reproducing unchecked, which can lead to formation of cancerous tumors. Despite its critical role, therapies at the molecular level to restore p53 native function remain elusive, due to its complex nature. Nevertheless, considerable information has been amassed, and new means of investigating the problem have become available.
Objectives
We consider structural, biophysical, and bioinformatic insights and their implications for the role of direct and indirect readout and how they contribute to binding site recognition, particularly those of low consensus. We then pivot to consider advances in computational approaches to drug discovery.
Materials and methods
We have conducted a review of recent literature pertinent to the p53 protein.
Results
Considerable literature corroborates the idea that p53 is a complex allosteric protein that discriminates its binding sites not only via consensus sequence through direct H-bond contacts, but also a complex combination of factors involving the flexibility of the binding site. New computational methods have emerged capable of capturing such information, which can then be utilized as input to machine learning algorithms towards the goal of more intelligent and efficient de novo allosteric drug design.
Conclusions
Recent improvements in machine learning coupled with graph theory and sector analysis hold promise for advances to more intelligently design allosteric effectors that may be able to restore native p53-DNA binding activity to mutant proteins.
Clinical relevance
The ideas brought to light by this review constitute a significant advance that can be applied to ongoing biophysical studies of drugs for p53, paving the way for the continued development of new methodologies for allosteric drugs. Our discoveries hold promise to provide molecular therapeutics which restore p53 native activity, thereby offering new insights for cancer therapies.
Graphical Abstract
Introduction
The tumor suppressor protein p53 is at the forefront of preventing cancers mainly through its interaction with DNA. When the cell has suffered an insult such as DNA damage, its primary role is to determine cell fate, either cellular repair or apoptosis, both with the goal of removing faulty cells from the gene pool. Mutation of p53 leaves compromised cells vulnerable to proliferate unchecked, leading to cancerous tumors. An estimated 50% or more of all human cancers are attributable to a mutation in this protein alone (Vogelstein et al. 2000), a fact which has earned it the title “Guardian of the Genome” (Lane 1992). Cancer is regarded as the second leading cause of death globally, accounting for about 9.6 million deaths, or just under 20% of all mortalities in recent years (Islami et al. 2021; Sung et al. 2021; Wild and Stewart 2014). The importance of curing cancer or at least advancing treatments is thus plainly evident.
While many labs and even entire institutions engaging the foremost scientists of our time work to find increasingly effective ways of dealing with the illness, surprisingly few advances for molecular-based therapies for curing cancer have emerged with any degree of success. While p53 is a complicated protein acting as a single, albeit important, hub in a wide regulatory network (Vogelstein et al. 2000), its direct control of genes can be cast in a simplistic view accessible to molecular biophysicists: it is a transcription factor binding to DNA, and when mutations occur, the ensemble of accessible structures changes, resulting in incorrect binding (Lane 1992). The challenges for molecular biophysicists to contribute to this field lie in the areas of understanding allostery: studying how a change far from the DNA binding interface can cause a change in the binding, and how to reinstate the native activity through drug design. New advances in graph theory (Abramson 2021; Novak and Gibbons 1999; Prathik et al. 2016) and networks (Aggarwal et al. 2021; Han et al. 2022; Novak and Gibbons 1999) can be applied to biomolecules to consider how mutated molecules may react. Additionally, machine learning techniques can discern subtle allosteric signals amidst large datasets. These methods can for the first time be applied to biomolecular problems using molecular simulations. In this review, we consider the current state of understanding of the structure and function of p53. We then proceed to understand how mutations may disrupt crucial interactions, and what efforts have been made at the molecular level to restore activity. Finally, we consider cutting-edge ideas in the field coupled with recent advances in graph theory and machine learning, which we envision as an exciting and novel path forward to open new possibilities to design molecular-based therapeutics to restore p53 activity.
It is an urgent and timely engineering problem requiring the cooperation of those in many disciplines. We consider this in light of p53, while realizing the possibility for translational information to impact the broader engineering and drug design fields to not only positively impact numerous currently undruggable diseases, but also as a means of allosterically designing the function of molecules.
p53 structure
Insights from the first crystal structure
Much of the insight into the mechanistic working of p53 has come from structural studies, most notably starting with X-ray crystallography. p53 is a 393 amino acid protein encoded by the TP53 gene which was discovered in 1979. It was identified as a target of SV40 tumor antigen (Linzer and Levine 1979); soon after, it was cloned and sequenced (Zakut-Houri et al. 1985). The full-length protein can be considered in terms of functional domains: the transcriptional activation domain (TAD, residues 1–62), which comprises a little over half of the N-terminal intrinsically disordered region of the protein; a proline-rich region (residues 63–69); the DNA binding domain (DBD, residues 102–292); a tetramerization/oligomerization domain (TD, residues 320–360); and the intrinsically disordered C-terminal (residues 361–393).
The proline-rich region contains an especially high number of proline amino acids, noted as the only amino acid whose side chain loops back to the backbone, giving it unique backbone angle accessibility readily identified in Ramachandran plots. This region has been associated with the apoptotic activity of p53, but again how this intrinsically disordered region confers activity is unexplained (Baptiste et al. 2002).
The DNA binding domain lies between the two intrinsically disordered regions. It forms a well-structured albeit thermally tenuous beta-barrel (Butler and Loh 2006). The tetramerization domain is believed to facilitate the quaternary association of four monomers to form a tetrameric structure, and the remainder of the C-terminus is a chemically basic region (Hamard et al. 2012; Jeffrey et al. 1995). The C-terminal region regulates p53 between a latent and an active state, in which p53 is inactive and active for DNA binding, respectively, though the exact mechanism remains unclear (Ayed et al. 2001). Post-translational modifications to the C-terminal region modulate DNA binding affinity; p53 isoforms without the C-terminal region have higher sequence-specific binding affinity (Appella and Anderson 2001; Ayed et al. 2001; Gu and Zhu 2012). While the C terminus is well established as a recipient of PTMs, how these regulate the DBD remains largely unclear in light of NMR evidence suggesting no conformational difference in the presence or absence of the C-terminal region (Ayed et al. 2001; Hamard et al. 2013).
Intrinsically disordered regions (IDR) of proteins (Oldfield and Dunker 2014) are highly mobile parts evading conventional structural determination. However, they bear considerable biological importance. In p53 they comprise about half of the protein and are the major sites of post-translational modification (PTM) regulation (Gu and Zhu 2012; Hamard et al. 2013; Oldfield and Dunker 2014). Despite lacking static structure, IDRs have a multi-funnel energy landscape (Chebaro et al. 2015), indicating that they can adopt specific structures, similar to the ordered nature of a static structure, but they more broadly sample conformational space. Although the dynamic nature of IDRs makes them largely inaccessible to experimentation, they may have been selected by nature as versatile control switches with multiple rather than binary options (Oldfield and Dunker 2014), as seems to be the case in p53. How the interdomain communication works from an intrinsically disordered region to the DBD remains largely unexplored.
The core of the protein was first crystallized in 1994 (PDBID 1TUP) by Cho et al. (1994) providing considerable atomic level detail of the structure of the protein and its interaction with an engineered consensus DNA sequence. Within the DBD, p53 exhibits classic beta barrel structure, along with some key structural features. In addition to the beta sheets making up the beta barrel, it contains a variety of important structural elements involved with DNA binding: the L1 loop, the L2 loop, the L3 loop, and the strand loop helix (Fig. 1B). Furthermore, it chelates a zinc via four coordination residues C176, H179, C238, and C242 which are located in the loops L2 and L3 (Blanden et al. 2015). The zinc is obligate; without these tether points, the structural integrity collapses in a short period of time, specificity is lost, and p53 tends to aggregate (Butler and Loh 2003).
H-bond interaction surface
Furthermore, the 1TUP structure (Fig. 1) (Cho et al. 1994) (2.2 angstroms resolution, R factor of 20.5%) identifies eight residues directly H-bonding with the DNA: K120, S241, R248, R273, A276, C277, R280, and R283. To facilitate crystallization, a sequence of high consensus was used instead of a biological sequence; as shall be addressed below, biological sequences tend to deviate from the more optimally interacting consensus by having fewer H-bonds between the protein and DNA. In this binding interface, six of the eight residues form H-bonds to the DNA backbone: K120, S241, R248, R273, A276, and R283. Only three residues, K120, C277, and R280, are capable of making sequence-discriminating interactions in the DNA major groove. K120 falls in both categories due to the formation of two distinct H-bonds. In addition to the structure of p53 bound to DNA, the corresponding unbound structure is also available (PDBID 2OJC) (Wang et al. 2007). The crystal structure raises a major question in the p53 field: only three positions of the binding site (the decamer helix directly interacting with protein side chains) appear to have sequence discrimination, so how is p53 able to identify binding sites? This question will be revisited in the forthcoming sections.
K120 alternate conformations
Further crystallographic studies captured two distinct conformations of the L1 loop leading to differences in K120 binding to DNA (Petty et al. 2011). Interestingly, these were captured within the same crystal; because of the space group of the crystal, two different p53 monomers were present within the same unit cell. The extended conformation and the recessed conformation in chains B and D respectively (PDB ID 3Q05) both appear under the same environmental conditions, suggesting that the two conformations may be energetically interchangeable. Further investigation suggested that the loop appears to primarily form the extended conformation when bound as a monomer and when free in solution. The recessed form again appears in a tetrameric form bound to DNA (PDB ID 4MZR) in a study of an engineered sequence (Emamzadah et al. 2014).
Hotspots
p53 is known to have several positions that, when mutated, have an exceptionally high prevalence in cancerous samples (Fig. 1). Such data has been collected by sequencing of p53 originating from tissue samples from human cancer patients and is curated by the IARC database (Hernandez-Boussard et al. 1999; Olivier et al. 2002) as an ongoing project. The following ten mutations have emerged as the top frequency hotspots: R175H, R248Q, R273H, R248W, R175L, Y220C, R273C, R282W, R248L, R175P. Of these, the top six account for 30% of all mutations. Interestingly, these mutations localize to the DBD. A potential explanation for this is offered in the following section. Although the Y220C mutation is not in the most frequent group, it is significant for its role in drug discovery; namely, the drug PK11000 has been shown to restore wild-type function in the Y220C mutation in vivo, eliminating cancerous tumors (Bauer et al. 2016; Zhou et al. 2023).
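The observation that the hotspots localize to the DBD can be checked directly against the domain boundaries quoted earlier in this review. The following is a minimal sketch; the names `P53_DOMAINS`, `residue_number`, and `domain_of` are illustrative, and residues falling outside the listed ranges are simply reported as unassigned.

```python
# Domain boundaries as quoted in this review (residue numbers, inclusive).
P53_DOMAINS = {
    "TAD": (1, 62),
    "proline-rich region": (63, 69),
    "DBD": (102, 292),
    "TD": (320, 360),
    "C-terminal region": (361, 393),
}

# Top-frequency hotspot substitutions as listed in the text.
HOTSPOTS = ["R175H", "R248Q", "R273H", "R248W", "R175L",
            "Y220C", "R273C", "R282W", "R248L", "R175P"]


def residue_number(mutation: str) -> int:
    """Extract the position from a substitution written like 'R175H'."""
    return int(mutation[1:-1])


def domain_of(position: int) -> str:
    """Return the name of the domain containing a residue, if any."""
    for name, (start, end) in P53_DOMAINS.items():
        if start <= position <= end:
            return name
    return "unassigned"


hotspot_domains = {m: domain_of(residue_number(m)) for m in HOTSPOTS}
for mutation, domain in hotspot_domains.items():
    print(f"{mutation}: {domain}")
```

Every listed hotspot maps to the DBD under these boundaries, consistent with the statement above.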
p53 oligomerization, quaternary structure, and the tetramerization domain
Similar to many transcription factors, p53 can oligomerize to form quaternary structures (Fig. 2). This is facilitated through the tetramerization domain (TD), which, as noted above, is found in the disordered C terminal region of p53. It has been crystallized in the form of four fragments (Jeffrey et al. 1995). Isoforms of p53 lacking the TD are still able to bind to DNA and initiate transcription. However, the binding affinity of isoforms without the TD is 10 to 100 times lower than those with a TD (McLure and Lee 1998). Tetramerization of p53 is required for at least some functions of p53. For example, ubiquitination—a universal signal for nuclear export and localization to the proteasome for destruction—of p53 requires it to be oligomerized; the nuclear export signal exists between residues 340 and 351, which is only exposed when p53 is monomeric and buried when it is tetramerized.
The TD is highly sensitive to mutations due to compounding destabilizing effects between four copies of the protein. Thus, even a single mutation in each of the four TDs is highly likely to at least weaken if not totally inhibit oligomerization. For this reason, a mutation in the TD may be as devastating to p53 wild-type (WT) function as a mutation in the DBD is, but as mentioned above, TD mutations do not number among the cancerous hotspot mutations. Many p53 gain-of-function mutations that result in cancer require oligomerization and the function of the TD and/or the C-terminal region; consequently, few cancers are found to be caused by point mutations in the TD which would inhibit oligomerization (Chène 2001).
While the tetrameric form is generally thought to be the biological form, there is controversy on the matter (Okorokov and Orlova 2009). Two structural studies differ in the relative orientation of the monomers, which leads to differences in the relative position of the binding interfaces. The first model, from X-ray crystallography, proposes that the monomers lie on the DNA in a head-to-head fashion, such that they contact the DNA in succession (Kitayner et al. 2010, 2006). The consensus sequence is made up of two half-sites, which are themselves made up of two head-to-head quarter sites (→ ← → ←) (McLure and Lee 1998). It follows that when p53 binds to the DNA consensus sequence as a tetramer, it does so as a dimer of dimers. The monomers within each dimer bind at consecutive quarter sites, rather than alternating quarter sites.
To further confuse matters, one DNA half-site appears to be sufficient for binding to the consensus sequence (Ly et al. 2020), but the presence and simultaneous binding of a second dimer in the tetramer has been reported to increase the binding affinity by 50-fold (McLure and Lee 1998). This, however, is somewhat surprising when juxtaposed with a genomic ChIP assay in which monomeric sites were often sufficient within the genomic DNA for competitive binding and subsequent immunoprecipitation.
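The head-to-head arrangement of quarter sites can be illustrated with the consensus half-site 5′-RRRCWWGYYY-3′ (El-Deiry et al. 1992): the second quarter site is the reverse complement of the first, so the half-site reads the same on both strands. The sketch below assumes the standard IUPAC complements (R pairs with Y, W with W); the helper name `reverse_complement` is illustrative.

```python
# Complements for the IUPAC symbols that occur in the p53 consensus.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C",
              "R": "Y", "Y": "R", "W": "W"}


def reverse_complement(pattern: str) -> str:
    """Reverse-complement a pattern written with A, C, G, T, R, Y, W."""
    return "".join(COMPLEMENT[symbol] for symbol in reversed(pattern))


HALF_SITE = "RRRCWWGYYY"
left_quarter, right_quarter = HALF_SITE[:5], HALF_SITE[5:]

print(left_quarter, "->", reverse_complement(left_quarter))
print("half-site is its own reverse complement:",
      reverse_complement(HALF_SITE) == HALF_SITE)
```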
In contrast to the crystallographic head-to-head model, the CryoEM structure of the full-length p53 at 13.7 angstrom resolution best characterizes p53 as two oppositely arranged dimers (Okorokov et al. 2006) (PDB: 1igt). This opens the possibility that p53 could bind DNA at two distinct interfaces on opposite sides, either as two distinct strands being bound simultaneously or a single strand that is bent back on itself to thread through twice. It is also possible that p53 could be capable of both of these structures, or even more, or bind in complexes, under different biological conditions.
Nevertheless, a binding interface involving a tetramer may confer overall improved binding for the complex in a kind of cooperativity. There are at least three plausible explanations for this increased binding affinity. First, interactions between the four TDs may create a conformational change that enhances binding. Second, the presence of a second dimer may simply increase the overall enthalpic interaction the tetramer has with the DNA. Third, dimer–dimer interactions may increase the likelihood of the tetramer finding a better consensus sequence than just a half-site. The p53 tetramer would randomly sample DNA conformations, with isolated quarter-sites having low binding affinity and half-sites having a higher binding affinity. This would ensure that the tetramer samples enough conformations at a consensus sequence and would increase the likelihood of successful binding. This complex would be reinforced by dimer–dimer interactions (McLure and Lee 1998). It is essentially a thermodynamic argument in which the binding of one monomer decreases the configurational search space for subsequently attached monomers to locate DNA to bind.
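To place the reported affinity gains on an energy scale, a fold-change in binding affinity can be converted to a free-energy difference with the standard relation ΔΔG = −RT ln(fold). The sketch below assumes a temperature of 298.15 K, which is not stated in the cited studies; the function name `ddg_from_fold` is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold(fold: float, temperature_k: float = 298.15) -> float:
    """Free-energy change (kcal/mol) for a fold-increase in binding affinity.

    The default temperature is an assumption made for illustration.
    """
    return -R_KCAL * temperature_k * math.log(fold)


# 50-fold gain reported for binding of the second dimer (McLure and Lee 1998)
print(f"50-fold:  {ddg_from_fold(50):.2f} kcal/mol")
# 10- to 100-fold range quoted for isoforms with versus without the TD
print(f"10-fold:  {ddg_from_fold(10):.2f} kcal/mol")
print(f"100-fold: {ddg_from_fold(100):.2f} kcal/mol")
```

Under this assumption, a 50-fold gain corresponds to roughly −2.3 kcal/mol, comparable to the contribution of a small number of hydrogen bonds.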
Response elements
A response element can be defined as a nucleotide sequence recognized by regulatory transcription factors, whose binding then drives the gene's response to regulatory signals. p53, serving as a transcription factor binding to defined DNA sites, activates gene expression in the stress response, with outcomes ranging from cell cycle control and DNA repair to apoptosis. p53 forms complexes with either consensus or natural response elements, with the complexes having similar binding affinity and specificity, as confirmed by binding competition against bulk genomic DNA (Vyas et al. 2017). A question that remains unsolved is how p53 efficiently manages its functions in response to stress, given that the p53 target genes involved in early stress response pathways possess more flexible response elements than genes that require strict regulation or genes whose outcomes occur later in the stress response. The flexibility of response elements contributes to the expression of p53 target genes, influencing the decisions of the system (Safieh et al. 2021).
p53–non-B-DNA interactions
Keeping in mind that p53 operates in the nucleus under conditions of DNA damage, it is not surprising that it can bind to alternative DNA conformations that abound in the nuclear environment. A typical cellular nucleus is about 5 to 20 µm in diameter, housing approximately 2 m (about 6 feet) of DNA. This feat is achieved by the wrapping of DNA around nucleosomes and interactions with other proteins that result in its supercoiling. As a result, much of this DNA is not in the canonical B-form (Corless and Gilbert 2016; Ravichandran et al. 2019); it may be in different double-stranded forms, such as A-DNA, C-DNA, or Z-DNA, and a variety of other non-canonical forms may additionally be encountered as the cell undergoes processes such as transcription and replication. While the DBD is responsible for making most B-DNA interactions, as well as Z-DNA interactions, triplex and G-quadruplex DNA interact with C-terminal residues of p53. Hotspot mutations in the DBD cause conformational changes or directly interfere with the DNA binding interface, abolishing the ability of p53 to recognize and bind to DNA sequences. p53 exhibits structure-specific DNA binding properties; it has the ability to bind a wide array of DNA structures including positively and negatively supercoiled DNA, three-way and four-way junctions, telomere T-loops, hemicatenate DNA, and cruciforms (Brázda and Fojta 2019; Jagelska et al. 2010; Štros et al. 2004). Hemicatenate DNA (Štros et al. 2004) is an intermediate structure formed during DNA replication, repair, and recombination. Its primary feature is that one of the strands is longer than its partner(s), and hemicatenate DNA can be found in single-stranded, double-helical, or four-stranded DNA. p53 has been shown in vitro to bind to hemicatenate DNA, and can adopt one of three different complexes, suggesting that p53 has the ability to differentiate between types of hemicatenate DNA.
Sequences of mirror-symmetric homopurine homopyrimidine DNA strands have been shown to form triple-stranded structures. G/C-rich sequences have been shown to form two types of four-stranded structures: G-quadruplexes and i-motifs. Both triplex and quadruplex structures occur non-randomly throughout the human genome; triplex sequences inhibit transcriptional regulation, and G-quadruplex structures are significant in DNA replication, telomere maintenance, and transcriptional regulation. p53 has been shown to have high affinity for binding to triplex and G-quadruplex structures (Adámik et al. 2016).
Crystallographic plausibility argument for molecular-based drug design
Crystallographic studies have also shed insight into the possibility of rational drug design to restore the activity of p53 mutants. The mutations R249S and T123A together are known to disrupt protein activity (Joerger et al. 2005; Nikolova et al. 2000; Suad et al. 2009). It has been reported that the distal H168R is capable of restoring normal p53 binding for these joint mutations. The study illustrates the principle that a mutation can be “undone” by the introduction of another perturbation to the system.
Other work has focused on the Y220C mutation site; while it is not a top-ranking hotspot, it has been a focus of study due to its prominent role in p53 restoration studies. One key feature of this mutation is the loss of an aromatic ring, and thus, one might reason that a strategy to restore the WT function would focus on reintroducing the aromatic group into the system as a sort of molecular prosthetic. To this end, compounds such as PK083, PK7088, and PK11000 emerged (Bykov et al. 2018; Lopes et al. 2019; Stahlecker et al. 2022). PK083 and PK7088 did not yield the desired results, but PK11000 did in an unexpected way: rather than locating at the site of the mutation, PK11000 was discovered to bind covalently at residue 182, distal from both the mutation site and the DNA (Bauer et al. 2016). In rodent models, it convincingly eradicated tumors involving Y220C (Bauer et al. 2016). Unfortunately, it has not been able to pass clinical trials, likely due to side effects from indiscriminate covalent binding to off-target thiols such as solvent-accessible cysteines of other proteins. These advances hint at the possibility that an active site of a transcription factor such as p53 could be modulated by using principles of allostery.
Biochemical and bioinformatics studies investigate p53 indirect readout mechanism for site discrimination
Biochemical and bioinformatics studies have also substantially contributed to the structure and function knowledge of this protein. Up to this point, several lines of evidence have begun to hint that p53 binding to target DNA may not follow a simple predictable lock and key mechanism.
As with many transcription factors, p53 binds a variety of DNA binding sites for which a general pattern emerges. The general consensus for p53 binding sites is 5′-RRRCWWGYYY-3′ where R = purine, Y = pyrimidine, W = A or T, and C and G represent their respective DNA bases (El-Deiry et al. 1992). This is consistent with the sequence of DNA in the 1TUP crystal structure (Cho et al. 1994). Such a sequence optimally complements the available H-bond donor and acceptor groups. In fact, simple models for computing the binding affinity of p53 to linear B-DNA sequences have been created (Brázda et al. 2016; Veprintsev and Fersht 2008).
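The consensus notation can be made concrete by expanding each IUPAC symbol into its allowed bases and counting how many positions of a candidate decamer deviate. This is a minimal sketch of consensus matching only, not one of the published affinity models cited above; the function name `mismatches` and the example decamers are hypothetical.

```python
# Allowed bases for each symbol in the p53 half-site consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG",   # purine
         "Y": "CT",   # pyrimidine
         "W": "AT"}   # A or T
CONSENSUS = "RRRCWWGYYY"


def mismatches(decamer: str, consensus: str = CONSENSUS) -> int:
    """Count positions where the decamer violates the consensus."""
    if len(decamer) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return sum(base not in IUPAC[symbol]
               for base, symbol in zip(decamer.upper(), consensus))


# Hypothetical example decamers, chosen only to illustrate the count.
print(mismatches("GGGCATGTCC"))  # satisfies every consensus position
print(mismatches("GGGCGTGTCC"))  # G at position 5 is not A or T
```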
Nevertheless, chromatin immunoprecipitation data suggests that p53 actually binds to a surprisingly wide array of sequences within the genomic context. In brief, the assay presents p53 with many genome fragments, which compete with each other to be bound by p53. Bound p53 is immunoprecipitated out and the DNA is extracted, then amplified and sequenced. Thus, a selection for sequences preferentially bound by p53 is obtained. Rounds of competitive binding, called SELEX, can also be applied to obtain an ever-increasingly selective binding pool (El-Deiry et al. 1992; Qian et al. 2002). The resource p53 BAER (Binding and Expression Resource) provides the results of such a ChIP genomic study at UC Santa Cruz Genome Browser (Kent et al. 2002; Nguyen et al. 2018; Raney et al. 2014). These studies uncovered some surprising results. The most commonly found p53 target fragment consisted of the consensus binding site without any spacers, as expected. However, about a quarter did not contain even one recognizable half-site (Nguyen et al. 2018). Furthermore, the binding site fragments could be mapped back to their original position in the genome map. As a transcription factor, a reasonable expectation might be that the majority of the binding sites should be located proximally to a transcription start site. However, even using a generous 5-kbp distance cutoff definition for this, only 35% of the sequences satisfied the criterion. An additional 25% of the sites were intragenic, and 41% fell in intergenic DNA (Nguyen et al. 2018).
As a theoretical comparison, one may produce a model equipped to score a genome for p53 binding sites based on the extent of consensus homology present. One such typical model predicts on the order of 800,000 binding sites with a 20-mer binding site definition plus a spacer between half-sites of 0 to 15, or only one half-site (Menendez et al. 2009). Compared to the ChIP results, this is a gross overestimation of binding sites. However, this also makes the important point that p53 is not just searching for the minimum consensus, but is somehow discriminating among sites that apparently match the consensus yet were not selected in the experiment. Thus, a complex picture of p53–DNA interactions emerges. The study raises the interesting question of why p53 appears to predominantly prefer DNA sequences that do not have recognizable homology to the consensus and that do not lie in promoter regions. It hints that p53 may be recognizing more than just sequence and that the situation may be more complex.
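The site definition used in such genome scans, two consensus half-sites separated by a spacer of 0 to 15 bases, can be sketched as a regular expression. This is a deliberately strict simplification that requires exact consensus matches and ignores the single half-site case, so it is not the scoring model of Menendez et al. (2009); the names `HALF`, `FULL_SITE`, and `find_full_sites` and the example sequence are hypothetical.

```python
import re

# One half-site, RRRCWWGYYY, written as explicit character classes.
HALF = "[AG][AG][AG]C[AT][AT]G[CT][CT][CT]"

# Two half-sites with a 0-15 base spacer. The lookahead permits overlapping
# candidates; the lazy quantifier reports the shortest spacer per start.
FULL_SITE = re.compile(rf"(?=({HALF})([ACGT]{{0,15}}?)({HALF}))")


def find_full_sites(sequence: str) -> list[tuple[int, int]]:
    """Return (start index, spacer length) for each full site found."""
    sequence = sequence.upper()
    return [(m.start(), len(m.group(2))) for m in FULL_SITE.finditer(sequence)]


# Hypothetical sequence: two half-sites separated by a 3-base spacer.
example = "TT" + "GGGCATGTCC" + "AAA" + "AGACTTGCCT" + "TT"
print(find_full_sites(example))
```

Relaxing the exact-match requirement, for instance by tolerating a bounded number of mismatches per half-site, is what drives the predicted site count into the hundreds of thousands.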
Indirect readout via L1 loop dynamics
As the evidence amasses, sequence homology alone can be sufficient for consensus sequences to be identified. However, for the majority of the sequences, this has not been the case, as we have argued above. An induced-fit mechanism could potentially be at work, and such an idea has been investigated for the L1 loop with Lysine 120 based on the differential conformations observed from the crystals (3Q05) mentioned above. Because the L1 loop is highly dynamic and two conformations are observed, the hypothesis that it could be a conformational switch as part of the greater recognition has been pursued (Lu et al. 2007; Petty et al. 2011). There are many energetically feasible conformations in dynamic interchange in the literature, which has also been our observation (Han et al. 2022; Ho et al. 2006; Joerger and Fersht 2007; Lukman et al. 2013; Slaw 2015). When the K120 contacts the DNA, it has the possibility to make two hydrogen bonds, contributing substantially to the enthalpy of binding at an interface with only 8 contacting residues expected as originally reported, and remains consistent with what we have seen in molecular dynamics (Han et al. 2022; Petty et al. 2011; Safieh et al. 2021; Slaw 2015). Petty et al. proposed an induced fit mechanism for this region and tested the hypothesis by measuring both the binding and unbinding kinetics. The expectation is that the binding affinity or by proxy the kinetic on-rate would be the driving factor for interaction that would differentiate between sequences. However, they report that the kinetic on-rate appeared indistinguishable between known p53 targets and p53-non-targets. They explained the anomaly via L1 loop dynamics. They surmise L1 loop dynamics play an integral role in the target sequence identification. Consistent with the kinetic off rate, the L1 dynamics aided in determining when p53 should disengage with the DNA. 
Thus, the overall scheme that arises from this view is that p53 indiscriminately binds onto DNA sequences. However, the operative question becomes whether it should remain bound; in other words, it is the residence time that becomes important. These experiments suggest that the dynamic L1 loop then probes the DNA to determine whether p53 will remain, making the position a binding site, or disengage. A slightly bent conformation of the DNA could assist in the formation of the interaction, thus serving as an indirect readout component, as these researchers had initially set out to investigate. A scanning mechanism for p53 to locate its binding sites has been suggested by electron microscopy (Melero et al. 2011), and this interpretation seems consistent with that, considering that under normal cellular conditions, p53 is sequestered by MDM2. It is cellular damage that releases p53 via MDM2 phosphorylation (Klein et al. 2021; Moll and Petrenko 2003; Momand et al. 2000; Thayer and Beyer 2016). This could resolve the apparent problem of ubiquitous p53 binding in the genome: when p53 is released, we may assume its binding is necessary because that is the condition of its release from MDM2; otherwise, it is not present in appreciable concentrations because it is bound by MDM2. When damage occurs, p53 quickly binds and scans the DNA, remaining at positions that are binding sites either by virtue of having excellent or close to perfect consensus, which optimizes the enthalpic contribution, or by virtue of shape complementarity of the DNA or the relative ease with which the DNA can be deformed into such a conformation, with the assistance of the L1 loop trailing behind to potentially add up to 2 additional H-bonds to retain p53 at the site.
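The distinction between on-rate and off-rate discrimination can be illustrated with the standard relations K_d = k_off / k_on and residence time = 1 / k_off. In the sketch below the rate constants are hypothetical values chosen only to show that two sites with identical on-rates can differ greatly in residence time; they are not measurements from Petty et al. (2011).

```python
from dataclasses import dataclass


@dataclass
class BindingKinetics:
    k_on: float   # association rate constant, 1/(M*s)
    k_off: float  # dissociation rate constant, 1/s

    @property
    def k_d(self) -> float:
        """Equilibrium dissociation constant in M."""
        return self.k_off / self.k_on

    @property
    def residence_time(self) -> float:
        """Mean lifetime of the bound complex in seconds."""
        return 1.0 / self.k_off


# Hypothetical rates: same on-rate, off-rates differing 100-fold.
target = BindingKinetics(k_on=1e6, k_off=0.01)
non_target = BindingKinetics(k_on=1e6, k_off=1.0)

for label, site in [("target", target), ("non-target", non_target)]:
    print(f"{label}: K_d = {site.k_d:.1e} M, "
          f"residence time = {site.residence_time:.0f} s")
```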
Additive energy model
We pushed an additive energy paradigm as proof of concept by constructing a computational model to attempt to quantitate the enthalpic contributions and estimate the corresponding contributions from indirect readout whether including the L1 loop or otherwise in an additive energy model (Thayer and Han 2017). The model numbers the positions of the direct contact region and uses the TG step at positions 5 to 6 in the binding site as a proxy for bending/bendability on the grounds that YpR steps are known from DNA mechanics studies to be highly flexible and easily bend (Beveridge et al. 2012, 2004; Dans et al. 2019; Dixit et al. 2005; Lavery et al. 2010; Pasi et al. 2014). For example, the approximately 90° bend in the cyclic AMP receptor protein structure (PDB ID 1CGP) is almost entirely attributable to the TG base pair steps exhibiting characteristic unstable stacking allowing them to bend (Schultz et al. 1991). A scoring scheme loosely based on free energy units in kilocalorie per mole was devised, and all possible decamers representative of one p53-DNA binding interface, or minimally required one half-site, were scored. The Smeenk et al. (2011) dataset was also scored, which provided 482 genomic binding sites.
The results suggested partitioning binding sequences into several categories depending upon the components of their scores, knowing that they were all in fact genomic sites. The first category and the single most populous contained sequences closely following the consensus. The 1TUP sequence falls into this category and thus may be considered an important structural insight for the canonical interactions. The second category represented those that lost the canonical TG step, and therefore the ability to readily bend at that position; they would have poor indirect readout due to a DNA kink, which may be recuperated by other features. AG was observed as the only allowable substitution in this category, and the C in the eighth position became conserved. The third class contained those with TG substituted by any other YR step, expected to have higher than average but lower than TG flexibility (Bertrand et al. 1998; Beveridge et al. 2004; Dixit et al. 2005), leading to higher presence of C at positions 4 and 8. The fourth category obliterated K120 and was compensated by the rise of homology among the group at several other positions. This group serves as an interesting counterpoint to the observations from the ChIP studies and those of Petty et al. (2011); they offer a representation of a sequence that would not necessarily be identified as a “consensus.”
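The distinction the categories draw at the flexible step (TG, another pyrimidine-purine step, or a step that is not YpR) can be expressed as a small classifier. This sketch captures only that step classification, not the scoring scheme or weights of the published model; the function name `step_class` is illustrative.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def step_class(step: str) -> str:
    """Classify a dinucleotide step as 'TG', 'other YpR', or 'non-YpR'."""
    step = step.upper()
    if len(step) != 2 or not set(step) <= PURINES | PYRIMIDINES:
        raise ValueError("step must be two bases drawn from A, C, G, T")
    if step == "TG":
        return "TG"
    if step[0] in PYRIMIDINES and step[1] in PURINES:
        return "other YpR"
    return "non-YpR"


# Enumerate all 16 dinucleotide steps and collect the YpR ones.
all_steps = [a + b for a in "ACGT" for b in "ACGT"]
ypr_steps = [s for s in all_steps if step_class(s) != "non-YpR"]
print(ypr_steps)
print(step_class("AG"))  # the substitution noted for the second category
```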
The authors point out the possibility of alternate H-bond formation when the YR step is lost to the YpR. The strict conservation of C4 introduces a G on the opposite strand of the base pair, which could bind with R248. However, no direct evidence that this occurs is available. Interestingly, in simulations of wild-type p53, mutant p53, and p53 rescued by a small molecule, R248 emerges in one of several H-bonds found exclusively in the Y220C mutant (Han et al. 2022). This ties back to the observations of Petty et al. that the kinetic on-rate was indistinguishable, and that it was the kinetic off-rate that discriminated between binding and non-binding sites. Consistent with that overall model, when the TG step is lost, the interactions with K120 seem to be conserved, except in the category that could potentially pick up binding contributions from R28.
Overall, these experiments suggest that p53 binding is complex and may be understood not only in terms of a thermally accessible molecular configurational ensemble but also in terms of a bioinformatics-based ensemble of interactions. By a variety of mechanisms, these energetic terms may sum to a sufficient binding interaction energy with a satisfactorily low kinetic off-rate, allowing a substantive residence time to activate transcription of target genes when a cell is experiencing damage and p53 has been released from MDM2.
p53 isoform splice variants
Two additional factors, isoforms and p53 post-translational modifications (PTMs), may also regulate p53 interactions with DNA. Isoforms (Fig. 3) arise in eukaryotic gene expression when a multienzyme complex called the spliceosome edits an mRNA after it has been transcribed from the DNA but before it is translated into a protein (Wilkinson et al. 2020). In this way, the same DNA may encode several different protein sequences. Different isoforms of p53 have been found in different parts of the body, contributing to their differentiation (Bourdon 2007). The full-length 393 amino acid variant is found in most tissues, yet some splice variants are tissue specific. For example, the splice variant Δ133p53α is only found in the colon, bone marrow, testis, fetal brain, and intestine, and the variant Δ133p53γ is absent from the brain, heart, lungs, fetal liver, salivary gland, breast, and intestine.
Human p53 has in total 12 isoforms with nine major forms, ranging from the full-length 393 amino acid wild-type to the much smaller residue number 94–312 fragment which approximately corresponds to the crystallized DBD fragment. The DBD is present in all isoforms; the isoforms vary in the truncations of the N and C terminal disordered regions. Isoforms have been suggested to play a role in cancer development (Bourdon 2007; Bourdon et al. 2005; Steffens Reinhardt et al. 2023). To begin to understand the mechanism of how this might arise at the molecular level, MD simulations of a selection of isoforms indicated that the structural behavior of the DBD particularly with respect to its interaction with DNA was modulated by the different isoforms through an allosteric mechanism, since the changes occur in the intrinsically disordered regions and not directly in the DBD. This suggests that the splice variants contain a hard-coded regulation that may contribute to the differences in p53 behavior in different tissues, which may be related to the differences in tumor prevalence depending on where in the body they arise (Armour-Garb et al. 2022).
p53 PTMs
p53 has been likened to a decision-making circuit board integrating information from multiple enzymes reporting on the status of the cell (Sullivan et al. 2012). Events compromising the genetic material, such as DNA double-strand breakage, formation of thymidine dimers by irradiation, hypoxia, heat shock, and exposure to radioactive chemicals, lead to the activation of p53. One mechanism by which this occurs is the direct modulation of p53 activity and specificity by PTMs (Joerger and Fersht 2008; Okorokov et al. 2006). PTMs are functional groups covalently attached to (or removed from) specific residues in the protein by enzymes, mainly in the N and C terminal regions. The most common PTMs are phosphorylation, methylation, acetylation, and ubiquitination. Several reviews provide an overview of PTM effects (Appella and Anderson 2001; Gu and Zhu 2012; Sullivan et al. 2012). Gu and Zhu provide a comprehensive diagram summarizing all currently known PTMs organized by predominant function. Once p53 becomes activated, it prevents the transmission of genetic errors to new cells by inducing cell cycle arrest and DNA repair, or, when cells are beyond repair, by initiating apoptosis. As a transcription factor (Bourdon et al. 2005), it carries out this role of determining cell fate by binding in the promoters of target genes associated with those outcomes and regulating their transcription (Okorokov and Orlova 2009). PTMs exert regulatory control over the sequence specificity of p53 through a complex combinatoric code on the N and C termini (Gu and Zhu 2012; Smeenk et al. 2011), contributing to the cell’s fate, but the molecular mechanism by which this occurs is not known. Furthermore, the binding sites of the two pathways are not separable on the basis of the DNA sequence.
Thus, understanding the mechanism of PTM control will shed light on how the cell fates are determined and provide insight into how this might be manipulated by therapeutics, thereby inducing the eradication of tumors.
Ten PTMs independently capable of regulating promoter-specific enhancement exist in the combinatorically complex PTM regulatory scheme, and offer the simplest starting point for understanding how PTMs affect DNA binding specificity. Of these, eight (S46, K320, R333, R335, R337, K370, K373, K382) are located in the IDR and thus can provide information on the allosteric nature of the regulation. From this list, we have selected to showcase information on S46 and K320 to represent both a phosphorylated and an acetylated residue, and to include a residue in each of the disordered regions. A broad literature on PTMs exists; within the scope of this review, we will glean general ideas from these well-defined examples.
Acetylation of the C-terminal K320 (Ac-K320) is associated with activation of the high-affinity p53 target sequence p21/WAF that mediates cell cycle arrest. The acetylation occurs via a cascade in which the kinase HIPK2 interacts with PCAF, a histone acetyltransferase (HAT) also capable of modifying p53, to increase its acetylation activity of p53 at K320 (Di Stefano et al. 2005). p53 has been shown to bind in the p21 promoter to two distinct upstream binding sites (−2.27 kb, termed the 5′ site, and −1.38 kb, termed the 3′ site) by DNaseI chromatin footprinting (Espinosa and Emerson 2001). Ac-K320 p53 binding specifically with this promoter’s sequence up-regulates the expression of the p21 gene product (Di Stefano et al. 2005; Knights et al. 2006).
Phosphorylation of the N-terminal S46 (P-S46) is associated with activation of the p53 target gene encoding the apoptosis-inducing protein p53AIP1, and commits the cell to apoptosis (D’Orazi et al. 2002; Oda et al. 2000). In response to ultraviolet irradiation, HIPK2 kinase transfers a phosphate to S46. Under the alternate stress of ionizing radiation (IR), the protein ATM activates dual-specificity tyrosine-phosphorylation-regulated kinase 2 (DYRK2), which phosphorylates S46 of p53 (Saito et al. 2002; Taira et al. 2007; Yoshida 2008). P-S46 activates p53 to bind to the promoter of p53AIP1, increasing the production of p53AIP1 protein, which initiates a mitochondrial apoptotic pathway (Matsuda et al. 2002; Oda et al. 2000; Yuan et al. 2012).
PTM modification of p53 N and C terminal regions to control DNA binding specificity is an example of allosteric regulation. Allostery occurs when an effector molecule binds and causes a change in a site in a distant part of the protein (Berezovsky and Nussinov 2022; Cooper and Dryden 1984; Fenton 2008). The phenomenon was first recognized in hemoglobin half a century ago (Monod et al. 1963) and has become so widely recognized in biological processes that it is termed “the second secret of life” (Fenton 2008). Over the years, the definition has broadened to include phenomena such as covalent allosteric effectors; PTMs constitute an important example of this type. Thus, in the allosteric view of the system, the effector is a PTM, and the active site modification is the changed binding affinity between the protein and its DNA binding sites.
On allosteric drug design
We have at this point considered the overview of p53 structure, which is most clear at the monomeric level, and considerably more controversial at higher order. Biochemical experiments studying the binding sequences of actual p53 sites paint a picture of a protein that readily recognizes its consensus sequence which binds as shown in the original 1TUP structure, but the situation quickly becomes more complex when considering the myriad of biological DNA sequences binding well and apparently in a discriminatory way that do not properly follow the consensus. From the bioinformatic point of view, the problem is considerably underdetermined, with at best three, or perhaps only two, of the contacts conferring specificity to the site. The idea that binding may occur by indirect readout taking into account the structure and dynamics of the DNA itself thus becomes an interesting proposition. Considering the variations that may occur due to post-translational modifications and various isoforms of p53 raises the idea of even more possibilities for factors controlling the p53-DNA interaction, which are very much in the stages of still being sorted out. Nevertheless, we emerge with the idea that p53 in all these cases binds DNA at the original interface. The Y220C rescue provides the case in point that molecular rescue is not only plausible, but also possible despite the complexities.
Encouraged by the Y220C success, we now turn our attention to a critical problem raised at the outset, namely, how to tackle designing drugs to restore p53 native activity. The serendipitous discovery of allosteric regulators such as the p53 restorative compound PK11000 has opened the idea of de novo design of allosteric regulators of proteins in which the active site is controlled from a distance. These will have a unique role in situations where one wishes to modulate the active site activity while leaving it fully intact. One apt application of this idea is in the reactivation of mutant proteins.
Y220C rescue by PK11000 operates on the principle of restoring a reasonable number of H-bonds at the active site (Han et al. 2022). This is of course much easier to arrive at a posteriori, with the solution in hand. The very fact that it was not designed from first principles leaves researchers at a loss for how to improve upon an excellent idea such that more iterations could be attempted with reasonably good guesses. Since the active site constitutes the protein-DNA interface, we are faced with the difficult problem of designing drugs that act upon this interface remotely. Any researcher hoping to design a drug to restore WT function must reverse engineer allosteric interactions, a difficult task indeed. While humans have limited success at this, machines are considerably better when a sufficient dataset and proper training are available, and when one has a specific outcome or goal that can be quantitated in a way that can be presented to a machine. In the second half of the review, we report on ideas for the identification of allosteric sites of control and the means by which machines may be used to identify patterns to optimize by making moves that equate to manipulating branches of graphs representing molecules. We propose exciting advances and ideas that, although they have not yet realized the molecules needed, provide a way forward in directions that have heretofore not been possible.
Machine learning in allosteric drug design
Generative adversarial neural networks (GANNs) (Aggarwal et al. 2021; Arora et al. 2023; Creswell et al. 2018) represent a powerful framework for unsupervised learning and generative modeling in machine learning. Comprising two distinct neural networks, a generator and a discriminator, GANNs operate through an adversarial training process. The generator network aims to generate synthetic samples, in this case drug molecules, that resemble real data, while the discriminator network learns to differentiate between real and synthetic samples. Through an iterative training process, the generator seeks to improve its ability to generate realistic samples, while the discriminator simultaneously enhances its discriminative capability. This competitive interplay fosters the refinement of both networks, leading to the generation of increasingly realistic and high-quality synthetic samples. Using a graph representation of the drug, a move for improvement of the drug equates to a change in the graph subject to the rules of basic chemistry. GANNs have demonstrated remarkable success in various domains, such as image synthesis, text generation, and audio generation, making them a widely researched and influential methodology in contemporary machine learning research.
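The notion of a graph move constrained by basic chemistry can be made concrete with a small Python sketch. This is a toy illustration under simplifying assumptions: the class name `MolGraph`, the valence table, and the single rule checked (no atom may exceed its maximum valence) are illustrative choices, not part of any specific generative model discussed here.

```python
# Assumed maximum valences for a few elements (simplified chemistry).
MAX_VALENCE = {"H": 1, "O": 2, "N": 3, "C": 4}


class MolGraph:
    """Toy molecular graph: atoms are nodes, bonds are weighted edges."""

    def __init__(self):
        self.atoms = []   # element symbol for each atom index
        self.bonds = {}   # (i, j) with i < j -> bond order

    def add_atom(self, element: str) -> int:
        if element not in MAX_VALENCE:
            raise ValueError(f"unsupported element: {element}")
        self.atoms.append(element)
        return len(self.atoms) - 1

    def used_valence(self, index: int) -> int:
        return sum(order for pair, order in self.bonds.items() if index in pair)

    def free_valence(self, index: int) -> int:
        return MAX_VALENCE[self.atoms[index]] - self.used_valence(index)

    def add_bond(self, i: int, j: int, order: int = 1) -> bool:
        """Apply the move only if both atoms have enough free valence."""
        key = (min(i, j), max(i, j))
        if i == j or key in self.bonds:
            return False
        if self.free_valence(i) < order or self.free_valence(j) < order:
            return False
        self.bonds[key] = order
        return True


mol = MolGraph()
c, o, n = mol.add_atom("C"), mol.add_atom("O"), mol.add_atom("N")
print(mol.add_bond(c, o, order=2))  # C=O fits both valences
print(mol.add_bond(o, n))           # oxygen has no free valence left
print(mol.add_bond(c, n))           # carbon still has two free valences
```

In a generative setting, a generator would propose such moves and only those passing the chemistry check would be retained for scoring.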
GANNs have emerged as a promising tool in the field of drug design and drug discovery (Abbasi et al. 2022; Bian and Xie 2021; Padalkar et al. 2021; Tripathi et al. 2022). They offer a unique approach to generating novel molecules with desired properties by combining the power of deep learning and generative modeling (Bian and Xie 2021). The generator network in GANNs can be trained to generate new chemical structures, while the discriminator network assesses the quality and desirability of these generated molecules based on various criteria, such as drug-likeness (Guan et al. 2019), bioactivity (Gaulton et al. 2012; Lane et al. 2021), or synthetic feasibility (Gao and Coley 2020; Liu et al. 2022; Thakkar et al. 2021). The adversarial training between the generator and discriminator enables GANNs to learn and capture the underlying chemical space, leading to the generation of novel molecules that possess desirable drug-like properties. GANNs hold great potential in accelerating the process of lead optimization, hit identification, and de novo drug design by facilitating the exploration of vast chemical spaces and enabling the discovery of novel therapeutic candidates with improved efficacy and safety profiles (Abbasi et al. 2022; Padalkar et al. 2021). Despite the challenges in optimizing the balance between novelty and drug-likeness, GANNs offer a promising avenue for the discovery of new drugs and the design of targeted therapeutics in an efficient and data-driven manner.
Molecular docking (Bai et al. 2023; Gschwend et al. 1996; Meng et al. 2011; Pinzi and Rastelli 2019) is a computational technique in which small molecules are placed and optimized in a binding pocket of a larger molecule, most often a protein. Autodock Vina (Trott and Olson 2009) is an open-source docking implementation maintained by the Forli lab at Scripps Research. It is one of the fastest and most widely used open-source docking engines. It uses a simple scoring function, contributing to its speed and ease of use. Flexible docking, which enables the user to specify a subset of the system to undergo coordinate optimization during the docking process in response to the inherent force field, yields the most precise results. This, however, comes at a steep computational cost. Autodock Vina takes advantage of parallelization of calculations as well as the ability to process jobs on graphics processing units (GPUs) (Gawehn et al. 2018; Pandey et al. 2022), enabling speedups of orders of magnitude over previous generations of docking software such as Autodock 4. Thus, researchers can now access high-precision docking in a shorter amount of (wallclock) time, or use the speedup to increase the number of compounds screened computationally.
Experimental screening of compounds, known as high-throughput screening (Liu et al. 2022; White 2000), has been the standard for drug discovery of biologically active hits. The high monetary cost of such screens is prohibitive, leading to the adoption of the aforementioned in silico screening methods, of which docking is one of the most prominent. While docking is most often used to screen a pre-existing virtual library of compounds, it can just as readily be used for screening and assessing compounds derived from generative AI approaches. Computational approaches using virtual libraries are an attractive way to screen an expanded chemical space. Presently, enumerating tens of billions of molecules is within the realm of possibility. Such molecules, however, are unlikely to be available from natural sources and thus require synthesis. Few can actually be made due to constraints of time, storage capacity, and cost of materials. Docking, however, can be instrumental in providing a rank order to indicate which are most promising (Shen et al. 2023). The two major concerns regarding docking are that the molecules may not be readily synthesizable, and that docking scores historically have not been sufficiently accurate to identify true hits, mostly due to false positives, which are magnified in such a large space. However, recent advances have mitigated these problems (Cavasotto and Di Filippo 2023). While the approach remains imperfect, these advances have made it viable and of practical use for many labs. The first such advance is the growing number of “make-on-demand” libraries (Schmidt et al. 2022) offered by vendors and academic laboratories. Users may request a molecule of interest, and a dedicated team will then synthesize it based on 2- or 3-component reactions and purify the compound.
Making use of the power of combinatorics, the company Enamine, for example, offers some 140 reactions among 120,000 molecular building blocks to offer the ability to synthesize a library spanning a space of 29 billion molecules, greatly increasing the probability that a desired molecule of interest can be created on demand for a reasonable price.
While molecular docking continues to deal with known imperfections, it has been successfully used to reasonably prioritize molecules worthy of subsequent study, even in these newly accessible ultra-large libraries in the size range of tens of billions. Integrating docking scores into the generator’s scoring function is a simple yet powerful way to improve the quality of the results generated. No longer are the molecules generated in a vacuum; they are generated and iteratively improved within the specific context of p53 and its DNA interactions.
Allosteric p53 and allosteric drug design
To begin this portion, we foray into the topic of allostery and its role in drug design. Allosteric regulation of proteins is a strategy to finely modulate protein activity while the active site remains fully intact and able to perform its function (Guarnera and Berezovsky 2020; Peng 2015). Allostery, or action at a distance, is regarded as so pervasive in biology that it has been dubbed “the second secret of life,” second only to the DNA genetic code itself (Fenton 2008). Such a signal can be transmitted even up to 100 angstroms away (Jayaraj et al. 2023; Lakhani et al. 2017). The p53 protein has been identified as an allosteric protein (Degn et al. 2022), and the hotspot mutation Y220C that has been implicated in allosteric drug design lies far from the DNA binding interface (Bauer et al. 2016). Unlike orthosteric drugs, allosteric drugs bind at a site distant from the active site, enabling them to modulate activity remotely rather than occlude it (Nussinov and Tsai 2012). Allosteric drugs may be more highly selective for their target because allosteric points of regulation are likely free from evolutionary selection, affording variation even among proteins in the same family (Lakhani et al. 2017). Allosteric effectors also often modulate the level of activity as opposed to completely turning activity on or off (Guarnera and Berezovsky 2016). This allows for safer dosage, especially for targets carrying out essential functions, and may make drugs safer in cases of accidental overdose or ingestion. Moreover, because the allosteric site differs from the active site, the drug does not compete with the native ligand, and therefore lower dosages may be effective (Grover 2013). Finally, allosteric drugs open possibilities for drugging previously undruggable targets (Hantschel et al. 2011; Hassin and Oren 2023).
While drugging an active site provides only a single binding site, many allosteric points of control may be available, offering many new inroads to successfully drugging these targets.
Although allosteric drugs offer numerous advantages, they continue to be discovered most often by compound screens, which are costly and require additional experimentation to determine how and where they bind (Chatzigoulas and Cournia 2021; Guarnera and Berezovsky 2020). Additionally, an assay for the binding of the allosteric effector and a functional assay will be required. Even at a cost of $1 per compound, likely a gross underestimate, a typical screen of 1 million compounds would cost $1 million. If a hit is found at all, little to no mechanistic knowledge is available without further experimentation, and refinement is often required, as testing in animal models and clinical trials often reveals. Thus, computational approaches for computer-aided drug design (CADD) are an indispensable strategy for researchers in both industrial and academic settings (Durrant and McCammon 2010). CADD approaches have enjoyed wide success in the development of drugs targeting active sites and other direct points on proteins (Hernández Alvarez et al. 2019; Irsheid et al. 2019; Kim et al. 2017; Ogunlana et al. 2022; Wang et al. 2022).
Sectors: identifying allosteric points of control for drug design
One formidable challenge to allosteric drug design is to identify locations of allosteric points of control. Advances have been made to identify allosteric points of control (Lakhani et al. 2020), a foremost model having been based on the idea of sector analysis. The “sector hypothesis” puts forth the idea that sector residues convey allosteric information between the site of the allosteric binding and the active site. The idea originated with making use of the sector analysis of the stock market arena to identify residues illustrating covariance in proteins over evolutionary time. Sectors convey the idea that groups of covarying entities may have some predictive capabilities for a larger trend, one of which may be allosteric signaling, an idea that was supported by early sector implementation with experimental verification (Lakhani et al. 2020; Reynolds et al. 2013, 2011).
The next generation of sectors was built on the idea of identifying residues of a single protein that covary due to thermal fluctuations in molecular dynamics simulations, called MD sectors. Thus, the covariance from multiple sequence alignments can be exchanged for covariance from MD simulation trajectories, removing the need to work with proteins from many species. It also provides a handle to engage with statistical mechanics and population dynamics by dividing the snapshots of a well-equilibrated system into various clusters and following their interchange, alongside the identification of the sector residues. The MD sectors lend key insights into allosteric signaling, in which a perturbation at one locale produces an effect at a distal site (Lakhani et al. 2020, 2017). The definition of allosteric may be broadened to encompass distant point mutations affecting a binding site as well as the binding of a small therapeutic molecule. Thus, MD sectors provide a quantifiable method to analyze allosteric signals, which enables the engineering of allosteric regulators to effect a desired outcome in tweaking the activity of an active site from a distance. Analyzing MD trajectories makes it possible not only to interpret the covariance of individual residues, but also to group the highest 20% of covarying residues into sectors.
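As a rough illustration of selecting the most strongly covarying residues from a trajectory, the following Python sketch ranks residues by their summed absolute covariance with all other residues and keeps the top 20%. This is a deliberate simplification and an assumption on our part: the published MD sector methods are more elaborate, and the function name, the per-residue scalar input, and the ranking criterion are illustrative only. The data are synthetic.

```python
import numpy as np


def top_covarying_residues(fluctuations, fraction=0.2):
    """Return sorted indices of residues with the largest total covariance.

    fluctuations: array of shape (n_frames, n_residues), one scalar per
    residue per snapshot (e.g. a displacement from the mean structure).
    """
    data = np.asarray(fluctuations, dtype=float)
    if data.ndim != 2 or data.shape[0] < 2 or data.shape[1] < 2:
        raise ValueError("expected shape (n_frames >= 2, n_residues >= 2)")
    covariance = np.cov(data, rowvar=False)  # residues are columns
    coupling = np.abs(covariance)
    np.fill_diagonal(coupling, 0.0)          # ignore each residue's own variance
    strength = coupling.sum(axis=1)
    n_top = max(1, int(np.ceil(fraction * data.shape[1])))
    ranked = np.argsort(strength)[::-1]
    return sorted(int(i) for i in ranked[:n_top])


# Synthetic "trajectory": 10 residues with weak independent noise,
# plus one shared motion added to residues 0 and 1.
rng = np.random.default_rng(0)
frames = 0.1 * rng.normal(size=(500, 10))
shared_motion = rng.normal(size=500)
frames[:, 0] += shared_motion
frames[:, 1] += shared_motion
print(top_covarying_residues(frames))
```

With real simulations the input would come from aligned trajectory snapshots, and the grouping into sectors would follow the cited methods rather than this simple ranking.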
MD sectors have recently been computed on p53 with an eye towards understanding how point mutations might propagate through the protein to affect the active site (Fabry and Thayer 2023). Recent iterations of MD sectors show that residues analyzed from MD trajectories can be quantitatively grouped into a sector. For the p53 tumor suppressor protein, the ability to quantify sectors and visualize how allosteric signals correspond to one another is a vital area of study. In the latest development of MD sector algorithms, quantifying multiple sectors within one protein has become a main subject of research. A single sector captures one region of allosteric signaling within a protein; dividing a protein into a network of multiple sectors, each hosting its own system of correlated residues, offers the potential for deeper analysis of well-studied and less-researched proteins alike.
In addition to using MD sectors to model allosteric networks, recent work suggests that allosteric networks can also be captured well by a new method called heat kernel analysis (HCA) (Avramidi 1999). The method borrows the idea of flow from physics; instead of following the flow of heat in coupled bodies, we instead use the framework to study the flow of the allosteric signal through protein residues in a graph-theoretic framework. Kernels are mathematical operators that perform a transformation to maximize the separation of some signal of interest in a latent space. We combine HCA with deep generative adversarial neural networks to generate small molecules. Machines modify docked molecules to restore native p53 conformation dynamics profiles from molecular dynamics simulations.
The heat kernel itself is a fundamental solution to the heat equation, representing the distribution of heat at any given point in space and time. On a graph, the heat kernel for a given time t is given by the equation:

$$H_t = e^{-tL} = P \, e^{-t\Lambda} \, P^{T}$$

where L is the normalized graph Laplacian, P is the matrix of its eigenvectors, Λ is the diagonal matrix of its eigenvalues, and $e^{-t\Lambda}$ is the diagonal matrix with e raised to the power of each eigenvalue of the graph Laplacian multiplied by −t. By analyzing the heat kernel using eigendecomposition, we can gain insights into various aspects of diffusion processes. Eigendecomposition is the factorization of a matrix into a canonical form, whereby a diagonalizable matrix is represented in terms of its eigenvalues and eigenvectors. The heat kernel provides information about the probability density function of the diffusion process, revealing how the distribution of heat changes over time and space. In this context, “heat” represents the allosteric signal diffusing over the protein as captured by the interaction energies from MD simulations, of which the electrostatic term has emerged as the most significant.
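The eigendecomposition route to the heat kernel can be sketched in a few lines of Python with NumPy. The sketch computes H_t = P exp(−tΛ) Pᵀ from the symmetric normalized graph Laplacian of a small weighted graph. The function name, the toy four-residue adjacency matrix, and the assumption that edge weights are non-negative (for example, magnitudes of interaction energies) are ours and serve only as illustration.

```python
import numpy as np


def heat_kernel(adjacency, t):
    """Heat kernel H_t = P exp(-t * Lambda) P^T of a weighted graph.

    adjacency: symmetric matrix of non-negative edge weights
    (assumed here to stand in for residue-residue interaction strengths).
    """
    weights = np.asarray(adjacency, dtype=float)
    if weights.ndim != 2 or weights.shape[0] != weights.shape[1]:
        raise ValueError("adjacency must be a square matrix")
    if not np.allclose(weights, weights.T):
        raise ValueError("adjacency must be symmetric")
    degree = weights.sum(axis=1)
    if np.any(degree <= 0):
        raise ValueError("every node needs positive total edge weight")
    inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^(-1/2) A D^(-1/2)
    laplacian = np.eye(len(degree)) - inv_sqrt[:, None] * weights * inv_sqrt[None, :]
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors @ np.diag(np.exp(-t * eigenvalues)) @ eigenvectors.T


# Toy chain of four "residues" with unit coupling between neighbors.
chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

kernel = heat_kernel(chain, t=1.0)
# Row 0 shows how a unit of "signal" placed on residue 0 has spread at t = 1.
print(np.round(kernel[0], 3))
```

Small values of t keep the signal local, whereas larger values let it spread across the graph, which is why the time parameter acts as a scale for the analysis.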
Given that any drug that is able to rescue mutated p53 will almost certainly be allosteric, identifying what the allosteric network is becomes the first-order problem. This approach of using MD simulations and a combination of sector analysis and HCA to elucidate the allosteric sectors within the protein offers invaluable insight into how various mutations and drugs alter the allosteric network, and what combinations of mutations and drugs offer WT rescue (Abramson 2021; Cowan 2023).
Drug design summary
The above two sections thus define a straightforward (and admittedly idealistic) drug development pipeline. Drugs are generated using a generative machine-learning network, which is constantly assessing generated molecules on their docking abilities with p53 using Autodock Vina. Then, simulations are run with the drug bound to the mutant form of the protein, and using MD sector analysis and heat kernel analysis, the allosterically connected residues are identified and compared to those of the wild-type. The process then repeats, until a sufficiently well-performing drug is generated.
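The loop described above can be written down as a schematic Python skeleton. Everything here is a hedged sketch: the callables `generate`, `dock_score`, and `sector_residues` are hypothetical stand-ins for a generative network, a docking run, and an MD-based sector or heat kernel analysis, and the use of Jaccard overlap with the wild-type residue set as the stopping criterion is our assumption rather than an established protocol.

```python
from typing import Any, Callable, Optional, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Overlap between two residue sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def design_loop(
    generate: Callable[[Optional[tuple]], Any],
    dock_score: Callable[[Any], float],
    sector_residues: Callable[[Any], Set[int]],
    wild_type_sectors: Set[int],
    target_overlap: float = 0.8,
    max_rounds: int = 10,
) -> Tuple[Any, float]:
    """Generate, dock, simulate, compare to wild type, and repeat."""
    best_candidate, best_overlap = None, -1.0
    feedback = None
    for _ in range(max_rounds):
        candidate = generate(feedback)          # generative model proposes a molecule
        docking = dock_score(candidate)         # docking assessment
        sectors = sector_residues(candidate)    # MD + sector / heat kernel analysis
        overlap = jaccard(sectors, wild_type_sectors)
        if overlap > best_overlap:
            best_candidate, best_overlap = candidate, overlap
        if overlap >= target_overlap:
            break
        feedback = (candidate, docking, overlap)  # passed back to the generator
    return best_candidate, best_overlap


# Toy stand-ins so the skeleton runs: candidates are integers and the
# "sector" of candidate k is simply the residue set {0, ..., k - 1}.
def toy_generate(feedback):
    return 1 if feedback is None else feedback[0] + 1


result = design_loop(
    generate=toy_generate,
    dock_score=lambda candidate: -float(candidate),
    sector_residues=lambda candidate: set(range(candidate)),
    wild_type_sectors={0, 1, 2, 3},
)
print(result)
```

In practice each stand-in hides a costly computation, so the number of rounds and the form of the feedback would need to be chosen with the available compute in mind.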
Conclusion
We have seen that the p53-DNA interactions are complex and nuanced; p53 can bind by virtue of a consensus binding site, or through something far more complex involving indirect readout of the dynamical properties of the DNA itself. Furthermore, tetramerization, post-translational modifications, and the presence of isoforms all contribute to the life or death decisions in which p53 is involved.
Turning then to prospects of molecular cures for cancers, we review the progress in light of the many new advances available to expand the possibilities beyond the status quo, which largely relies upon educated chance. Recognizing that both mutations and small molecules may act from a distance, drawing upon knowledge of allostery greatly informs new approaches. Sector analysis may assist researchers in identifying allosteric residues. While rescue does not guarantee recapitulating the binding interface, recent work focused on allosteric sectors demonstrates a promising role in explaining how the serendipitously discovered PK11000 works, alongside the potential to analyze other drugs focused on p53. Although PK11000 itself has not been found suitable for human use, new insights provide ideas as to how the experimental drug could have been engineered, so that we may design other trials de novo.
Furthermore, the new possibility of designing drugs in silico opens new prospects. Potential small molecule therapeutics can be represented using graph theory and machine learning techniques such as generative adversarial neural networks. Such methods may be able to use feedback from molecular simulations, energetics, or other related data to modify those graphs, thereby engineering allosteric molecules based on feedback through the network. Challenges involve developing and tuning pipelines to provide feedback from MD simulations, allostery, the H-bond interface, the DNA binding interface, sectors, and any other computational metric in a feedback loop to improve drugs. While the full potential of computational techniques has not yet been realized, the method holds great promise for advances in the development of molecular therapeutics. Furthermore, computational methods are often translational, with potential applications to other currently undruggable targets. Such means may also have applications for engineering molecules of a desired function.
In conclusion, p53 is a complex, fascinating, and timely molecule to study. Some 3 decades after its structural discovery, we have yet to find molecular cures to restore its mutants to native function. Combining biophysical understanding with machine learning, new horizons are now opening on approaches with potential to shed greater light on its biomolecular function while making an impact on finding molecular cures for cancer.
Data availability
Not applicable. This article is a review article.
References
Abbasi M, Santos BP, Pereira TC, Sofia R, Monteiro NRC, Simões CJV, Brito RMM, Ribeiro B, Oliveira JL, Arrais JP (2022) Designing optimized drug candidates with generative adversarial network. J Cheminform 14:40. https://doi.org/10.1186/s13321-022-00623-6
Abramson D (2021) The colors of a protein: protein dynamics through the lens of spectral graph theory. Wesleyan University, Middletown, CT
Adámik M, Kejnovská I, Bažantová P, Petr M, Renčiuk D, Vorlíčková M, Brázdová M (2016) p53 binds human telomeric G-quadruplex in vitro. Biochimie 128–129:83–91. https://doi.org/10.1016/j.biochi.2016.07.004
Aggarwal A, Mittal M, Battineni G (2021) Generative adversarial network: an overview of theory and applications. Int J Inform Manage Data Insights 1:100004. https://doi.org/10.1016/j.jjimei.2020.100004
Appella E, Anderson CW (2001) Post-translational modifications and activation of p53 by genotoxic stresses. Eur J Biochem 268:2764–2772. https://doi.org/10.1046/j.1432-1327.2001.02225.x
Armour-Garb I, Han ISM, Cowan BS, Thayer KM (2022) Variable regions of p53 isoforms allosterically hard code DNA interaction. J Phys Chem B 126:8495–8507. https://doi.org/10.1021/acs.jpcb.2c06229
Arora J, Tushir M, Kherwa P, Rathee S (2023) Generative adversarial networks: a comprehensive review. Data Wrangling: Concepts, Applications and Tools 213. https://doi.org/10.1002/9781119879862.ch10
Avramidi IG (1999) Covariant techniques for computation of the heat kernel. Rev Math Phys 11:947–980. https://doi.org/10.1142/S0129055X99000295
Ayed A, Mulder FA, Yi GS, Lu Y, Kay LE, Arrowsmith CH (2001) Latent and active p53 are identical in conformation. Nat Struct Biol 8:756–760. https://doi.org/10.1038/nsb0901-756
Bai G, Pan Y, Zhang Y, Li Y, Wang J, Wang Y, Teng W, Jin G, Geng F, Cao J (2023) Research advances of molecular docking and molecular dynamic simulation in recognizing interaction between muscle proteins and exogenous additives. Food Chem 429:136836. https://doi.org/10.1016/j.foodchem.2023.136836
Baptiste N, Friedlander P, Chen X, Prives C (2002) The proline-rich domain of p53 is required for cooperation with anti-neoplastic agents to promote apoptosis of tumor cells. Oncogene 21:9–21. https://doi.org/10.1038/sj.onc.1205015
Bauer MR, Joerger AC, Fersht AR (2016) 2-Sulfonylpyrimidines: mild alkylating agents with anticancer activity toward p53-compromised cells. Proc Natl Acad Sci 113:E5271–E5280. https://doi.org/10.1073/pnas.1610421113
Berezovsky IN, Nussinov R (2022) Multiscale allostery: basic mechanisms and versatility in diagnostics and drug design. J Mol Biol 434:167751. https://doi.org/10.1016/j.jmb.2022.167751
Bertrand H-O, Fermandjian S, Ha-Duong T, Hartmann B (1998) Flexibility of the B-DNA backbone: effects of local and neighbouring sequences on pyrimidine-purine steps. Nucleic Acids Res 26:1261–1267. https://doi.org/10.1093/nar/26.5.1261
Beveridge DL, Dixit SB, Barreiro G, Thayer KM (2004) Molecular dynamics simulations of DNA curvature and flexibility: helix phasing and premelting. Biopolymers 73:380–403. https://doi.org/10.1002/bip.20019
Beveridge DL, Cheatham TE, Mezei M (2012) The ABCs of molecular dynamics simulations on B-DNA, circa 2012. J Biosci 37:379–397. https://doi.org/10.1007/s12038-012-9222-6
Bian Y, Xie X-Q (2021) Generative chemistry: drug discovery with deep learning generative models. J Mol Model 27:71. https://doi.org/10.1007/s00894-021-04674-8
Blanden AR, Yu X, Wolfe AJ, Gilleran JA, Augeri DJ, O’Dell RS, Olson EC, Kimball SD, Emge TJ, Movileanu L, Carpizo DR, Loh SN (2015) Synthetic metallochaperone ZMC1 rescues mutant p53 conformation by transporting zinc into cells as an ionophore. Mol Pharmacol 87:825–831. https://doi.org/10.1124/mol.114.097550
Bourdon J-C (2007) p53 and its isoforms in cancer. Br J Cancer 97:277–282. https://doi.org/10.1038/sj.bjc.6603886
Bourdon J-C, Fernandes K, Murray-Zmijewski F, Liu G, Diot A, Xirodimas DP, Saville MK, Lane DP (2005) p53 isoforms can regulate p53 transcriptional activity. Genes Dev 19:2122–2137. https://doi.org/10.1101/gad.1339905
Brázda V, Fojta M (2019) The rich world of p53 DNA binding targets: the role of DNA structure. Int J Mol Sci 20:5605. https://doi.org/10.3390/ijms20225605
Brázda V, Kolomazník J, Lýsek J, Hároníková L, Coufal J, Št’astný J (2016) Palindrome analyser – a new web-based server for predicting and evaluating inverted repeats in nucleotide sequences. Biochem Biophys Res Commun 478:1739–1745. https://doi.org/10.1016/j.bbrc.2016.09.015
Butler JS, Loh SN (2003) Structure, function, and aggregation of the zinc-free form of the p53 DNA binding domain. Biochemistry 42:2396–2403. https://doi.org/10.1021/bi026635n
Butler JS, Loh SN (2006) Folding and misfolding mechanisms of the p53 DNA binding domain at physiological temperature. Protein Sci 15:2457–2465. https://doi.org/10.1110/ps.062324206
Bykov VJN, Eriksson SE, Bianchi J, Wiman KG (2018) Targeting mutant p53 for efficient cancer therapy. Nat Rev Cancer 18:89–102. https://doi.org/10.1038/nrc.2017.109
Cavasotto CN, Di Filippo JI (2023) The impact of supervised learning methods in ultralarge high-throughput docking. J Chem Inf Model 63:2267–2280. https://doi.org/10.1021/acs.jcim.2c01471
Chatzigoulas A, Cournia Z (2021) Rational design of allosteric modulators: challenges and successes. Wiley Interdiscip Rev: Comput Mol Sci 11:e1529. https://doi.org/10.1002/wcms.1529
Chebaro Y, Ballard AJ, Chakraborty D, Wales DJ (2015) Intrinsically Disordered Energy Landscapes. Sci Rep 5:10386. https://doi.org/10.1038/srep10386
Chène P (2001) The role of tetramerization in p53 function. Oncogene 20:2611–2617. https://doi.org/10.1038/sj.onc.1204373
Cho Y, Gorina S, Jeffrey PD, Pavletich NP (1994) Crystal structure of a p53 tumor suppressor-DNA complex: understanding tumorigenic mutations. Science 265:346–355. https://doi.org/10.1126/science.8023157
Cooper A, Dryden DT (1984) Allostery without conformational change: a plausible model. Eur Biophys J 11:103–109. https://doi.org/10.1007/BF00276625
Corless S, Gilbert N (2016) Effects of DNA supercoiling on chromatin architecture. Biophys Rev 8:245–258. https://doi.org/10.1007/s12551-016-0210-1
Cowan BS (2023) Optimizing allosteric analysis: a Wasserstein distance and heat kernel-based methodology for investigating p53 energetics. Wesleyan University, Middletown, CT
Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA (2018) Generative adversarial networks: an overview. IEEE Signal Process Mag 35:53–65. https://doi.org/10.1109/MSP.2017.2765202
D’Orazi G, Cecchinelli B, Bruno T, Manni I, Higashimoto Y, Saito S, Gostissa M, Coen S, Marchetti A, Del Sal G, Piaggio G, Fanciulli M, Appella E, Soddu S (2002) Homeodomain-interacting protein kinase-2 phosphorylates p53 at Ser 46 and mediates apoptosis. Nat Cell Biol 4:11–19. https://doi.org/10.1038/ncb714
Dans PD, Balaceanu A, Pasi M, Patelli AS, Petkevičiūtė D, Walther J, Hospital A, Bayarri G, Lavery R, Maddocks JH, Orozco M (2019) The static and dynamic structural heterogeneities of B-DNA: extending Calladine-Dickerson rules. Nucleic Acids Res 47:11090–11102. https://doi.org/10.1093/nar/gkz905
Degn K, Beltrame L, Dahl Hede F, Sora V, Nicolaci V, Vabistsevits M, Schmiegelow K, Wadt K, Tiberti M, Lambrughi M, Papaleo E (2022) Cancer-related mutations with local or long-range effects on an allosteric loop of p53. J Mol Biol 434:167663. https://doi.org/10.1016/j.jmb.2022.167663
Di Stefano V, Soddu S, Sacchi A, D’Orazi G (2005) HIPK2 contributes to PCAF-mediated p53 acetylation and selective transactivation of p21Waf1 after nonapoptotic DNA damage. Oncogene 24:5431–5442. https://doi.org/10.1038/sj.onc.1208717
Dixit SB, Beveridge DL, Case DA, Cheatham TE, Giudice E, Lankas F, Lavery R, Maddocks JH, Osman R, Sklenar H, Thayer KM, Varnai P (2005) Molecular dynamics simulations of the 136 unique tetranucleotide sequences of DNA oligonucleotides. II: Sequence context effects on the dynamical structures of the 10 unique dinucleotide steps. Biophys J 89:3721–3740. https://doi.org/10.1529/biophysj.105.067397
Durrant JD, McCammon JA (2010) Computer-aided drug-discovery techniques that account for receptor flexibility. Curr Opin Pharmacol 10:770–774. https://doi.org/10.1016/j.coph.2010.09.001
El-Deiry WS, Kern SE, Pietenpol JA, Kinzler KW, Vogelstein B (1992) Definition of a consensus binding site for p53. Nat Genet 1:45–49. https://doi.org/10.1038/ng0492-45
Emamzadah S, Tropia L, Vincenti I, Falquet B, Halazonetis TD (2014) Reversal of the DNA-binding-induced loop L1 conformational switch in an engineered human p53 protein. J Mol Biol 426:936–944. https://doi.org/10.1016/j.jmb.2013.12.020
Espinosa JM, Emerson BM (2001) Transcriptional regulation by p53 through intrinsic DNA/chromatin binding and site-directed cofactor recruitment. Mol Cell 8:57–69. https://doi.org/10.1016/S1097-2765(01)00283-0
Fabry JD, Thayer KM (2023) Network analysis of molecular dynamics sectors in the p53 protein. ACS Omega 8:571–587. https://doi.org/10.1021/acsomega.2c05635
Fenton AW (2008) Allostery: an illustrated definition for the ‘second secret of life.’ Trends Biochem Sci 33:420–425. https://doi.org/10.1016/j.tibs.2008.05.009
Gao W, Coley CW (2020) The synthesizability of molecules proposed by generative models. J Chem Inf Model 60:5714–5723. https://doi.org/10.1021/acs.jcim.0c00174
Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100–D1107. https://doi.org/10.1093/nar/gkr777
Gawehn E, Hiss JA, Brown JB, Schneider G (2018) Advancing drug discovery via GPU-based deep learning. Expert Opin Drug Discov 13:579–582. https://doi.org/10.1080/17460441.2018.1465407
Grover AK (2013) Use of allosteric targets in the discovery of safer drugs. Med Princ Pract 22:418–426. https://doi.org/10.1159/000350417
Gschwend DA, Good AC, Kuntz ID (1996) Molecular docking towards drug discovery. J Mol Recognit 9:175–186. https://doi.org/10.1002/(SICI)1099-1352(199603)9:2%3c175::AID-JMR260%3e3.0.CO;2-D
Gu B, Zhu W-G (2012) Surf the post-translational modification network of p53 regulation. Int J Biol Sci 8:672–684. https://doi.org/10.7150/ijbs.4283
Guan L, Yang H, Cai Y, Sun L, Di P, Li W, Liu G, Tang Y (2019) ADMET-score–a comprehensive scoring function for evaluation of chemical drug-likeness. Medchemcomm 10:148–157. https://doi.org/10.1039/c8md00472b
Guarnera E, Berezovsky IN (2016) Allosteric sites: remote control in regulation of protein activity. Curr Opin Struct Biol 37:1–8. https://doi.org/10.1016/j.sbi.2015.10.004
Guarnera E, Berezovsky IN (2020) Allosteric drugs and mutations: chances, challenges, and necessity. Curr Opin Struct Biol 62:149–157. https://doi.org/10.1016/j.sbi.2020.01.010
Hamard P-J, Lukin DJ, Manfredi JJ (2012) p53 basic C terminus regulates p53 functions through DNA binding modulation of subset of target genes. J Biol Chem 287:22397–22407. https://doi.org/10.1074/jbc.M111.331298
Hamard P-J, Barthelery N, Hogstad B, Mungamuri SK, Tonnessen CA, Carvajal LA, Senturk E, Gillespie V, Aaronson SA, Merad M, Manfredi JJ (2013) The C terminus of p53 regulates gene expression by multiple mechanisms in a target- and tissue-specific manner in vivo. Genes Dev 27:1868–1885. https://doi.org/10.1101/gad.224386.113
Han ISM, Abramson D, Thayer KM (2022) Insights into rational design of a new class of allosteric effectors with molecular dynamics Markov state models and network theory. ACS Omega 7:2831–2841. https://doi.org/10.1021/acsomega.1c05624
Hantschel O, Grebien F, Superti-Furga G (2011) Targeting allosteric regulatory modules in oncoproteins: “drugging the undruggable.” Oncotarget 2:828–829. https://doi.org/10.18632/oncotarget.354
Hassin O, Oren M (2023) Drugging p53 in cancer: one protein, many targets. Nat Rev Drug Discov 22:127–144. https://doi.org/10.1038/s41573-022-00571-8
Hernández Alvarez L, Barreto Gomes DE, Hernandez Gonzalez JE, Pascutti PG (2019) Dissecting a novel allosteric mechanism of cruzain: a computer-aided approach. PLoS ONE 14:e0211227. https://doi.org/10.1371/journal.pone.0211227
Hernandez-Boussard T, Rodriguez-Tome P, Montesano R, Hainaut P (1999) IARC p53 mutation database: a relational database to compile and analyze p53 mutations in human tumors and cell lines. Hum Mutat 14:1–8. https://doi.org/10.1002/(SICI)1098-1004(1999)14:1%3c1::AID-HUMU1%3e3.0.CO;2-H
Ho WC, Fitzgerald MX, Marmorstein R (2006) Structure of the p53 core domain dimer bound to DNA. J Biol Chem 281:20494–20502. https://doi.org/10.1074/jbc.M603634200
Irsheid L, Wehler T, Borek C, Kiefer W, Brenk R, Ortiz-Soto ME, Seibel J, Schirmeister T (2019) Identification of a potential allosteric site of Golgi α-mannosidase II using computer-aided drug design. PLoS ONE 14:e0216132. https://doi.org/10.1371/journal.pone.0216132
Islami F, Ward EM, Sung H, Cronin KA, Tangka FKL, Sherman RL, Zhao J, Anderson RN, Henley SJ, Yabroff KR, Jemal A, Benard VB (2021) Annual Report to the Nation on the Status of Cancer, Part 1: National Cancer Statistics. JNCI J Nat Cancer Inst 113:1648–1669. https://doi.org/10.1093/jnci/djab131
Jagelska EB, Pivonkova H, Fojta M, Brazda V (2010) The potential of the cruciform structure formation as an important factor influencing p53 sequence-specific binding to natural DNA targets. Biochem Biophys Res Commun 391:1409–1414. https://doi.org/10.1016/j.bbrc.2009.12.076
Jayaraj A, Thayer KM, Beveridge DL, Hingorani MM (2023) Molecular dynamics of mismatch detection-how MutS uses indirect readout to find errors in DNA. Biophys J 122:3031–3043. https://doi.org/10.1016/j.bpj.2023.06.006
Jeffrey PD, Gorina S, Pavletich NP (1995) Crystal structure of the tetramerization domain of the p53 tumor suppressor at 1.7 angstroms. Science 267:1498–1502. https://doi.org/10.1126/science.7878469
Joerger AC, Fersht AR (2007) Structure-function-rescue: the diverse nature of common p53 cancer mutants. Oncogene 26:2226–2242. https://doi.org/10.1038/sj.onc.1210291
Joerger AC, Fersht AR (2008) Structural biology of the tumor suppressor p53. Annu Rev Biochem 77:557–582. https://doi.org/10.1146/annurev.biochem.77.060806.091238
Joerger AC, Ang HC, Veprintsev DB, Blair CM, Fersht AR (2005) Structures of p53 cancer mutants and mechanism of rescue by second-site suppressor mutations. J Biol Chem 280:16030–16037. https://doi.org/10.1074/jbc.M500179200
Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D (2002) The human genome browser at UCSC. Genome Res 12:996–1006. https://doi.org/10.1101/gr.229102
Kim J, Yang G, Ha J (2017) Targeting of AMP-activated protein kinase: prospects for computer-aided drug design. Expert Opin Drug Discov 12:47–59. https://doi.org/10.1080/17460441.2017.1255194
Kitayner M, Rozenberg H, Kessler N, Rabinovich D, Shaulov L, Haran TE, Shakked Z (2006) Structural basis of DNA recognition by p53 tetramers. Mol Cell 22:741–753. https://doi.org/10.1016/j.molcel.2006.05.015
Kitayner M, Rozenberg H, Rohs R, Suad O, Rabinovich D, Honig B, Shakked Z (2010) Diversity in DNA recognition by p53 revealed by crystal structures with Hoogsteen base pairs. Nat Struct Mol Biol 17:423–429. https://doi.org/10.1038/nsmb.1800
Klein AM, de Queiroz RM, Venkatesh D, Prives C (2021) The roles and regulation of MDM2 and MDMX: it is not just about p53. Genes Dev 35:575–601. https://doi.org/10.1101/gad.347872.120
Knights CD, Catania J, Giovanni SD, Muratoglu S, Perez R, Swartzbeck A, Quong AA, Zhang X, Beerman T, Pestell RG, Avantaggiati ML (2006) Distinct p53 acetylation cassettes differentially influence gene-expression patterns and cell fate. J Cell Biol 173:533–544. https://doi.org/10.1083/jcb.200512059
Lakhani B, Thayer KM, Hingorani MM, Beveridge DL (2017) Evolutionary covariance combined with molecular dynamics predicts a framework for allostery in the MutS DNA mismatch repair protein. J Phys Chem B 121:2049–2061. https://doi.org/10.1021/acs.jpcb.6b11976
Lakhani B, Thayer KM, Black E, Beveridge DL (2020) Spectral analysis of molecular dynamics simulations on PDZ: MD sectors. J Biomol Struct Dyn 38:781–790. https://doi.org/10.1080/07391102.2019.1588169
Lane DP (1992) p53, guardian of the genome. Nature 358:15–16. https://doi.org/10.1038/358015a0
Lane TR, Foil DH, Minerali E, Urbina F, Zorn KM, Ekins S (2021) Bioactivity comparison across multiple machine learning algorithms using over 5000 datasets for drug discovery. Mol Pharmaceutics 18:403–415. https://doi.org/10.1021/acs.molpharmaceut.0c01013
Lavery R, Zakrzewska K, Beveridge DL, Bishop TC, Case DA, Cheatham TEI, Dixit SB, Jayaram B, Lankas F, Laughton C, Maddocks JH, Michon A, Osman R, Orozco M, Perez R, Singh T, Spackova N, Sponer J (2010) A systematic molecular dynamics study of nearest-neighbor effects on base pair and base pair step conformations and fluctuations in B-DNA. Nucleic Acids Res 38:299–313. https://doi.org/10.1093/nar/gkp834
Linzer DIH, Levine AJ (1979) Characterization of a 54K Dalton cellular SV40 tumor antigen present in SV40-transformed cells and uninfected embryonal carcinoma cells. Cell 17:43–52. https://doi.org/10.1016/0092-8674(79)90293-9
Liu C-H, Korablyov M, Jastrzębski S, Włodarczyk-Pruszyński P, Bengio Y, Segler M (2022) RetroGNN: fast estimation of synthesizability for virtual screening and de novo design by learning from slow retrosynthesis software. J Chem Inf Model 62:2293–2300. https://doi.org/10.1021/acs.jcim.1c01476
Lopes EA, Gomes S, Saraiva L, Santos MMM (2019) Small molecules targeting mutant P53: a promising approach for cancer treatment. Curr Med Chem 26:7323–7336. https://doi.org/10.2174/0929867325666181116124308
Lu Q, Tan Y-H, Luo R (2007) Molecular dynamics simulations of p53 DNA-binding domain. J Phys Chem B 111:11538–11545. https://doi.org/10.1021/jp0742261
Lukman S, Lane DP, Verma CS (2013) Mapping the structural and dynamical features of multiple p53 DNA binding domains: insights into loop 1 intrinsic dynamics. PLoS ONE 8:e80221. https://doi.org/10.1371/journal.pone.0080221
Ly E, Kugel JF, Goodrich JA (2020) Single molecule studies reveal that p53 tetramers dynamically bind response elements containing one or two half sites. Sci Rep 10:16176. https://doi.org/10.1038/s41598-020-73234-6
Matsuda K, Yoshida K, Taya Y, Nakamura K, Nakamura Y, Arakawa H (2002) p53AIP1 regulates the mitochondrial apoptotic pathway. Cancer Res 62:2883–2889
McLure KG, Lee PWK (1998) How p53 binds DNA as a tetramer. EMBO J 17:3342–3350. https://doi.org/10.1093/emboj/17.12.3342
Melero R, Rajagopalan S, Lázaro M, Joerger AC, Brandt T, Veprintsev DB, Lasso G, Gil D, Scheres SHW, Carazo JM, Fersht AR, Valle M (2011) Electron microscopy studies on the quaternary structure of p53 reveal different binding modes for p53 tetramers in complex with DNA. Proc Natl Acad Sci 108:557–562. https://doi.org/10.1073/pnas.1015520107
Menendez D, Inga A, Resnick MA (2009) The expanding universe of p53 targets. Nat Rev Cancer 9:724–737. https://doi.org/10.1038/nrc2730
Meng X-Y, Zhang H-X, Mezei M, Cui M (2011) Molecular docking: a powerful approach for structure-based drug discovery. Curr Comput Aided Drug Des 7:146–157. https://doi.org/10.2174/157340911795677602
Moll UM, Petrenko O (2003) The MDM2-p53 interaction. Mol Cancer Res 1:1001–1008
Momand J, Wu H-H, Dasgupta G (2000) MDM2 — master regulator of the p53 tumor suppressor protein. Gene 242:15–29. https://doi.org/10.1016/S0378-1119(99)00487-4
Monod J, Changeux J-P, Jacob F (1963) Allosteric proteins and cellular control systems. J Mol Biol 6:306–329. https://doi.org/10.1016/S0022-2836(63)80091-1
Nguyen T-AT, Grimm SA, Bushel PR, Li J, Li Y, Bennett BD, Lavender CA, Ward JM, Fargo DC, Anderson CW, Li L, Resnick MA, Menendez D (2018) Revealing a human p53 universe. Nucleic Acids Res 46:8153–8167. https://doi.org/10.1093/nar/gky720
Nikolova PV, Wong K, DeDecker B, Henckel J, Fersht AR (2000) Mechanism of rescue of common p53 cancer mutations by second-site suppressor mutations. EMBO J 19:370–378. https://doi.org/10.1093/emboj/19.3.370
Novak L, Gibbons A (1999) Hybrid graph theory and network analysis, Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511666391
Nussinov R, Tsai C-J (2012) The different ways through which specificity works in orthosteric and allosteric drugs. Curr Pharm Des 18:1311. https://doi.org/10.2174/138161212799436377
Oda K, Arakawa H, Tanaka T, Matsuda K, Tanikawa C, Mori T, Nishimori H, Tamai K, Tokino T, Nakamura Y, Taya Y (2000) p53AIP1, a potential mediator of p53-dependent apoptosis, and its regulation by Ser-46-phosphorylated p53. Cell 102:849–862. https://doi.org/10.1016/S0092-8674(00)00073-8
Ogunlana AT, Oyedele A-QK, Boyenle ID, Ayoola SO, Ajibare AC, Adeyemi AO, Jinadu LA, Adenrele OT, Alausa AO, Adelusi TI (2022) Computer-aided drug design of some KRAS G12C inhibitors: targeting the covalent and allosteric binding site for cancer therapy. Inform Med Unlocked 32:101032. https://doi.org/10.1016/j.imu.2022.101032
Okorokov AL, Orlova EV (2009) Structural biology of the p53 tumour suppressor. Curr Opin Struct Biol 19:197–202. https://doi.org/10.1016/j.sbi.2009.02.003
Okorokov AL, Sherman MB, Plisson C, Grinkevich V, Sigmundsson K, Selivanova G, Milner J, Orlova EV (2006) The structure of p53 tumour suppressor protein reveals the basis for its functional plasticity. EMBO J 25:5191–5200. https://doi.org/10.1038/sj.emboj.7601382
Oldfield CJ, Dunker AK (2014) Intrinsically disordered proteins and intrinsically disordered protein regions. Annu Rev Biochem 83:553–584. https://doi.org/10.1146/annurev-biochem-072711-164947
Olivier M, Eeles R, Hollstein M, Khan MA, Harris CC, Hainaut P (2002) The IARC TP53 database: new online mutation analysis and recommendations to users. Hum Mutat 19:607–614. https://doi.org/10.1002/humu.10081
Padalkar GR, Patil SD, Hegadi MM, Jaybhaye NK (2021) Drug discovery using generative adversarial network with reinforcement learning, in: 2021 International Conference on Computer Communication and Informatics (ICCCI). IEEE, pp. 1–3
Pandey M, Fernandez M, Gentile F, Isayev O, Tropsha A, Stern AC, Cherkasov A (2022) The transformational role of GPU computing and deep learning in drug discovery. Nat Mach Intell 4:211–221. https://doi.org/10.1038/s42256-022-00463-x
Pasi M, Maddocks JH, Beveridge D, Bishop TC, Case DA, Cheatham T III, Dans PD, Jayaram B, Lankas F, Laughton C, Mitchell J, Osman R, Orozco M, Pérez A, Petkevičiūtė D, Spackova N, Sponer J, Zakrzewska K, Lavery R (2014) μABC: a systematic microsecond molecular dynamics study of tetranucleotide sequence effects in B-DNA. Nucleic Acids Res 42:12272–12283. https://doi.org/10.1093/nar/gku855
Peng JW (2015) Investigating dynamic interdomain allostery in Pin1. Biophys Rev 7:239–249. https://doi.org/10.1007/s12551-015-0171-9
Petty TJ, Emamzadah S, Costantino L, Petkova I, Stavridi ES, Saven JG, Vauthey E, Halazonetis TD (2011) An induced fit mechanism regulates p53 DNA binding kinetics to confer sequence specificity. EMBO J 30:2167–2176. https://doi.org/10.1038/emboj.2011.127
Pinzi L, Rastelli G (2019) Molecular docking: shifting paradigms in drug discovery. Int J Mol Sci 20:4331. https://doi.org/10.3390/ijms20184331
Prathik A, Uma K, Anuradha J (2016) An overview of application of graph theory. Int J ChemTech Res 9:242–248
Qian H, Wang T, Naumovski L, Lopez CD, Brachmann RK (2002) Groups of p53 target genes involved in specific p53 downstream effects cluster into different classes of DNA binding sites. Oncogene 21:7901–7911. https://doi.org/10.1038/sj.onc.1205974
Raney BJ, Dreszer TR, Barber GP, Clawson H, Fujita PA, Wang T, Nguyen N, Paten B, Zweig AS, Karolchik D, Kent WJ (2014) Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser. Bioinformatics 30:1003–1005. https://doi.org/10.1093/bioinformatics/btt637
Ravichandran S, Subramani VK, Kim KK (2019) Z-DNA in the genome: from structure to disease. Biophys Rev 11:383–387. https://doi.org/10.1007/s12551-019-00534-1
Reynolds KA, McLaughlin RN, Ranganathan R (2011) Hot spots for allosteric regulation on protein surfaces. Cell 147:1564–1575. https://doi.org/10.1016/j.cell.2011.10.049
Reynolds KA, Russ WP, Socolich M, Ranganathan R (2013) Evolution-based design of proteins, in: Methods in enzymology. Elsevier, pp. 213–235. https://doi.org/10.1016/B978-0-12-394292-0.00010-2
Safieh J, Chazan A, Vyas P, Saleem H, Danin-Poleg Y, Ron D, Haran TE (2021) DNA dynamics dictates p53 functional outcome. https://doi.org/10.1101/2021.09.18.460898
Saito S, Goodarzi AA, Higashimoto Y, Noda Y, Lees-Miller SP, Appella E, Anderson CW (2002) ATM mediates phosphorylation at multiple p53 sites, including Ser46, in response to ionizing radiation. J Biol Chem 277:12491–12494. https://doi.org/10.1074/jbc.C200093200
Schmidt R, Klein R, Rarey M (2022) Maximum common substructure searching in combinatorial make-on-demand compound spaces. J Chem Inf Model 62:2133–2150. https://doi.org/10.1021/acs.jcim.1c00640
Schultz SC, Shields GC, Steitz TA (1991) Crystal structure of a CAP-DNA complex: the DNA is bent by 90 degrees. Science 253:1001–1007. https://doi.org/10.1126/science.1653449
Shen C, Zhang X, Hsieh C-Y, Deng Y, Wang D, Xu L, Wu J, Li D, Kang Y, Hou T (2023) A generalized protein–ligand scoring framework with balanced scoring, docking, ranking and screening powers. Chem Sci 14:8129–8146. https://doi.org/10.1039/d3sc02044d
Slaw BS (2015) Towards a unified theory of induced-fit binding: characterization of the readout mechanism and the L1 loop’s role in the p53 tumor suppressor protein binding event via molecular dynamics simulations. Vassar College
Smeenk L, van Heeringen SJ, Koeppel M, Gilbert B, Janssen-Megens E, Stunnenberg HG, Lohrum M (2011) Role of p53 serine 46 in p53 target gene regulation. PLoS ONE 6:e17574. https://doi.org/10.1371/journal.pone.0017574
Stahlecker J, Klett T, Schwer M, Jaag S, Dammann M, Ernst LN, Braun MB, Zimmermann MO, Kramer M, Lämmerhofer M, Stehle T, Coles M, Boeckler FM (2022) Revisiting a challenging p53 binding site: a diversity-optimized HEFLib reveals diverse binding modes in T-p53C-Y220C. RSC Med Chem 13:1575–1586. https://doi.org/10.1039/D2MD00246A
Steffens Reinhardt L, Groen K, Newton C, Avery-Kiejda KA (2023) The role of truncated p53 isoforms in the DNA damage response. Biochimica et Biophysica Acta (BBA) - Rev Cancer 1878:188882. https://doi.org/10.1016/j.bbcan.2023.188882
Štros M, Muselíková-Polanská E, Pospíšilová Š, Strauss F (2004) High-affinity binding of tumor-suppressor protein p53 and HMGB1 to hemicatenated DNA loops. Biochemistry 43:7215–7225. https://doi.org/10.1021/bi049928k
Suad O, Rozenberg H, Brosh R, Diskin-Posner Y, Kessler N, Shimon LJW, Frolow F, Liran A, Rotter V, Shakked Z (2009) Structural basis of restoring sequence-specific DNA binding and transactivation to mutant p53 by suppressor mutations. J Mol Biol 385:249–265. https://doi.org/10.1016/j.jmb.2008.10.063
Sullivan KD, Gallant-Behm CL, Henry RE, Fraikin J-L, Espinosa JM (2012) The p53 circuit board. Biochimica et Biophysica Acta (BBA) - Rev Cancer 1825:229–244. https://doi.org/10.1016/j.bbcan.2012.01.004
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F (2021) Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA A Cancer J Clin 71:3. https://doi.org/10.3322/caac.21660
Taira N, Nihira K, Yamaguchi T, Miki Y, Yoshida K (2007) DYRK2 is targeted to the nucleus and controls p53 via Ser46 phosphorylation in the apoptotic response to DNA damage. Mol Cell 25:725–738. https://doi.org/10.1016/j.molcel.2007.02.007
Thakkar A, Chadimová V, Bjerrum EJ, Engkvist O, Reymond J-L (2021) Retrosynthetic accessibility score (RAscore)–rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem Sci 12:3339–3349. https://doi.org/10.1039/D0SC05401A
Thayer KM, Beyer GA (2016) Energetic landscape of MDM2-p53 interactions by computational mutagenesis of the MDM2-p53 interaction. PLoS ONE 11:e0147806. https://doi.org/10.1371/journal.pone.0147806
Thayer KM, Han ISM (2017) Chemical principles additive model aligns low consensus DNA targets of p53 tumor suppressor protein. Comput Biol Chem 68:186–193. https://doi.org/10.1016/j.compbiolchem.2017.03.003
Tripathi S, Augustin AI, Dunlop A, Sukumaran R, Dheer S, Zavalny A, Haslam O, Austin T, Donchez J, Tripathi PK (2022) Recent advances and application of generative adversarial networks in drug discovery, development, and targeting. Artificial Intelligence in the Life Sci 100045
Trott O, Olson AJ (2009) AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 31:455–461. https://doi.org/10.1002/jcc.21334
Veprintsev DB, Fersht AR (2008) Algorithm for prediction of tumour suppressor p53 affinity for binding sites in DNA. Nucleic Acids Res 36:1589–1598. https://doi.org/10.1093/nar/gkm1040
Vogelstein B, Lane DP, Levine AJ (2000) Surfing the p53 network. Nature 408:307–310. https://doi.org/10.1038/35042675
Vyas P, Beno I, Xi Z, Stein Y, Golovenko D, Kessler N, Rotter V, Shakked Z, Haran TE (2017) Diverse p53/DNA binding modes expand the repertoire of p53 response elements. Proc Natl Acad Sci 114:10624–10629. https://doi.org/10.1073/pnas.1618005114
Wang Y, Rosengarth A, Luecke H (2007) Structure of the human p53 core domain in the absence of DNA. Acta Cryst D 63:276–281. https://doi.org/10.1107/S0907444906048499
Wang G, Bai Y, Cui J, Zong Z, Gao Y, Zheng Z (2022) Computer-aided drug design boosts RAS inhibitor discovery. Molecules 27:5710. https://doi.org/10.3390/molecules27175710
White RE (2000) High-throughput screening in drug metabolism and pharmacokinetic support of drug discovery. Annu Rev Pharmacol Toxicol 40:133–157. https://doi.org/10.1146/annurev.pharmtox.40.1.133
Wild CP, Stewart BW (2014) World Cancer Report 2014. World Health Organization, Geneva, Switzerland
Wilkinson ME, Charenton C, Nagai K (2020) RNA splicing by the spliceosome. Annu Rev Biochem 89:359–388. https://doi.org/10.1146/annurev-biochem-091719-064225
Yoshida K (2008) Role for DYRK family kinases on regulation of apoptosis. Biochem Pharmacol 76:1389–1394. https://doi.org/10.1016/j.bcp.2008.05.021
Yuan L, Tian C, Wang H, Song S, Li D, Xing G, Yin Y, He F, Zhang L (2012) Apak competes with p53 for direct binding to intron 1 of p53AIP1 to regulate apoptosis. EMBO Rep 13:363–370. https://doi.org/10.1038/embor.2012.10
Zakut-Houri R, Bienz-Tadmor B, Givol D, Oren M (1985) Human p53 cellular tumor antigen: cDNA sequence and expression in COS cells. EMBO J 4:1251–1255. https://doi.org/10.1002/j.1460-2075.1985.tb03768.x
Zhou S, Chai D, Wang X, Neeli P, Yu X, Davtyan A, Young K, Li Y (2023) AI-powered discovery of a novel p53–Y220C reactivator. Front Oncol 13:1229696. https://doi.org/10.3389/fonc.2023.1229696
Acknowledgements
The authors gratefully acknowledge fruitful discussion with the Molecules to Medicine group at Wesleyan University, especially David L. Beveridge, Michael P. Weir, and Conrad Fischl. We are also grateful for having the opportunity to share aspects of this work with the Ascona B-DNA Consortium. This work was supported by NIH R15 GM128102 to KMT.
Funding
This work was supported by NIH R15 GM128102 to KMT.
Author information
Contributions
Kelly M. Thayer and In Sub Mark Han conceived the ideas for the article and outlined it. Kelly M. Thayer, Sean Stetson, Fernando Caballero, and Christopher Chiu wrote the article. Kelly M. Thayer, In Sub Mark Han, Sean Stetson, Fernando Caballero and Christopher Chiu contributed to revising and editing the manuscript.
Ethics declarations
Ethical approval
Not applicable. This is a review article and there are neither human nor animal studies involved.
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Thayer, K.M., Stetson, S., Caballero, F. et al. Navigating the complexity of p53-DNA binding: implications for cancer therapy. Biophys Rev (2024). https://doi.org/10.1007/s12551-024-01207-4
DOI: https://doi.org/10.1007/s12551-024-01207-4