Introduction

The idea that proteins fluctuate between many structures of an ensemble is well established [1]. In contrast to the ensemble view, protein X-ray crystallography generally follows a convention, established with myoglobin and hemoglobin in the middle of the last century [2, 3], to describe the model built into the electron density map in the singular form: “we have solved the structure of protein X”. The solved structure generally represents the best fit of a single polypeptide model to the electron density generated by the diffraction pattern of ~10^15 molecules in the crystal lattice during seconds to hours of data collection. The protein molecules within a crystal populate a ‘conformational landscape’ dominated by nearly iso-energetic and conformationally similar substates [4]. However, the refinement process, with few exceptions [5–7], is designed to converge on a single structure.

Here we review how crystallography can be used to reveal conformational substates that interconvert independent of any conformational changes induced by ligand binding or other perturbations. We refer to pre-existing conformational diversity as polysterism (as in [8, 9]), making a semantic distinction between structural diversity (polysterism) and sequence diversity (polymorphism). Early investigations of polysterism that used crystallography and complementary solution methods are highlighted before we focus on several recent studies that have provided new insight into functionally important polysterism, often from a single electron density map. These studies reveal a general theme: the minor substates of the conformational landscape often resemble the structures that play important roles in ligand binding, enzyme catalytic cycles, or evolutionary transitions. Thus, characterizing the polysterism of a single protein can yield insight by identifying minor conformations that may be sampled during evolution or during a reaction cycle.

Crystalline enzymes are often active

Despite the stable lattice arrangement necessary for X-ray diffraction, many enzyme crystals are capable of catalyzing chemical reactions [10, 11]. However, their activity is often reduced compared to the free enzyme in a crowded solution [12, 13]. Because the crowding of a typical protein crystal lattice is similar to the macromolecular crowding in a biological cellular environment [14–16], the reduction in catalytic efficiency likely originates from specific conformational restrictions of the crystal lattice. These observations provided some of the first indications that the ability to transition between structurally distinct conformational substates may be a functionally important property of proteins.

Crystallography gives endpoints for an induced fit view

The importance of structurally distinct substates in enzyme function was first promoted through Koshland’s [17] induced fit hypothesis, which proposed that proteins move between discrete structures as a result of interaction with a ligand. Induced fit theory, which was originally based on the necessity for a substrate-specific, catalytically competent structure of hexokinase to allow activity with glucose and prevent activity with other potential substrates such as water, was generalized to most enzymes and receptors. Koshland’s predictions were later supported by clear structural differences between ligand-bound and unbound crystal structures of carboxypeptidase and hexokinase [18–20], providing a structural corroboration for the importance of conformational rearrangements during the catalytic cycle of some enzymes. These structural studies established that solving independent crystal structures with different ligands is a powerful tool that can be used to observe the end-points of conformational change. The importance of conformational change in enzyme function was buttressed by solution experiments such as volume changes calculated from centrifugation and small-angle X-ray scattering experiments [18, 20].

Many conformational substates are present in a protein crystal

Despite an initial emphasis on the single-structure view of proteins, polysterism has been widely appreciated from the beginnings of structure–function studies. For instance, polysterism was a fundamental aspect of the allosteric model proposed by Monod, Wyman and Changeux (MWC) [21] in 1965, in which allosteric effectors were proposed to function by changing the pre-existing distribution of active or inactive conformational substates within a population of protein molecules. Polysterism was also an important part of statistical mechanical descriptions of protein conformation and function [22, 23], while solvent interaction was predicted to introduce a heterogeneity in proteins at an early stage [4].

Although a role for intrinsic conformational diversity was recognized conceptually, the structural evidence to support it was initially slight. In the early stages of the field of protein crystallography, when high-resolution diffraction was relatively rare, polysterism was often not observable, even if it was present. More importantly, crystallization, by its very nature, favors ordered aggregation of molecules that are homogeneous in composition and conformation, establishing the direct relationship between crystalline order and diffraction. Indeed, large regions of disordered solvent molecules were recognized as a likely origin of X-ray diffraction resolution limits [14]. These observations introduced a paradox—in order to resolve functional conformational heterogeneity, we need high-resolution X-ray diffraction, but intrinsic heterogeneity limits diffraction.

To resolve this apparent contradiction, the pioneering studies of Frauenfelder [24] used an indirect readout of heterogeneity at moderate resolutions: atomic B-factors (also called Debye–Waller factors, temperature factors, atomic displacement parameters, and thermal parameters). The B-factor fits the decay of electron density surrounding the modeled mean position of an atom. Using atomic B-factors, they calculated the magnitudes of the spatial distribution of protein atomic movements without resolving independent discrete conformations. Frauenfelder and Petsko used crystals of myoglobin and variable diffraction temperatures to discover differences between the mobility of core and surface atoms. They observed that the magnitude of motion varied according to the distance from the central haem-group. Building upon this observation, they produced a model of a rigid core surrounded by semi-liquid regions to the outside, and suggested a possible ‘dynamic’ pathway for substrate and product diffusion. This analysis suggested that conformational substates, particularly for side chains, exist within the crystal and that temperature can affect the magnitude of motion within these substates. Concurrently, Phillips and coworkers [25] demonstrated a conservation of the distribution of B-factors between homologous proteins crystallized in different space groups. Since the homologous proteins crystallized in different lattice arrangements, their results suggested that B-factor distributions are, at least partially, independent of the crystallization process. Additionally, the conservation of high B-factors at the entrance to the active site suggested that the protein mobility might have been maintained throughout evolution and play a significant part in the enzymatic activity. These groundbreaking papers provided structural support for the role of polysterism and heralded a new view within protein crystallography: there are interconverting structures within the crystal, revealed by deeply considering experimental data, beyond the singular model of the protein.
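
To give a sense of the physical scale behind a B-factor, the standard isotropic relation B = 8π²⟨u²⟩ can be inverted to estimate the root-mean-square displacement of an atom along one direction. The Python sketch below performs this conversion. The function name and the example B-factors are illustrative assumptions and are not values taken from the studies cited above; the relation also assumes isotropic, harmonic displacement, which is only an approximation for atoms that occupy discrete substates.

```python
import math

def rms_displacement(b_factor):
    """Return the RMS displacement (in Å) along one direction for an
    isotropic B-factor (in Å^2), using B = 8 * pi^2 * <u^2>."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

# Hypothetical B-factors chosen only to illustrate the scale of the effect
for label, b in [("core atom", 10.0), ("surface atom", 40.0)]:
    u = rms_displacement(b)
    print(f"{label}: B = {b:.0f} Å^2 -> RMS displacement = {u:.2f} Å")
```

Because the displacement scales with the square root of B, a four-fold difference in B-factor corresponds to only a two-fold difference in the estimated displacement.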

Evidence for conformational substates from solution

While crystallography has remained popular for independently revealing interconverting structures, substantial evidence for the importance of polysterism has come from solution methods. In particular, nuclear magnetic resonance (NMR) relaxation-based methods have opened up the field of protein dynamics at a wide range of time scales [26]. Order parameters of fast (ps–ns) time scale dynamics can lead to degenerate models of motion [27]. These measurements are limited to backbone amides and specific side-chain residues and do not provide any information about coupling or correlated motion between sites [28]. Intermediate (μs–ms) time scale motions revealed by CPMG (Carr–Purcell–Meiboom–Gill, R2) or R1ρ relaxation dispersion methods yield information about two- or three-state differences in chemical shift, population, and exchange rates [29]. However, since distinct states are inferred from differences in chemical shift, many types of motion can be invoked to explain the signal. Further, movement of surrounding residues, rather than the residue being probed, can drive the chemical shift differences. The simultaneous fitting of relative population, exchange rates, and chemical shifts is generally underdetermined for an individual residue. This is normally overcome by assuming a collective motion for a group of residues and jointly constraining their relative populations and interconversion rates [26]. NMR relaxation methods can identify groups of residues that likely experience exchange between states, but these methods do not provide a reference to a causal structural mechanism of the dynamic signal.
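
The underdetermination for a single residue can be illustrated with the standard fast-exchange approximation for a two-site system, in which the exchange contribution to transverse relaxation is R_ex ≈ p_A p_B Δω² / k_ex. The Python sketch below is an illustration under stated assumptions, not a fitting procedure from the cited work: the function names and all parameter values are hypothetical, and the formula applies only when k_ex is much larger than Δω. It shows that two quite different combinations of minor-state population and chemical shift difference produce the same R_ex.

```python
import math

def rex_fast_exchange(p_minor, delta_omega, k_ex):
    """Exchange contribution to R2 (s^-1) for two-site fast exchange.

    p_minor     -- fractional population of the minor state (0 to 1)
    delta_omega -- chemical shift difference between states (rad/s)
    k_ex        -- sum of forward and reverse exchange rates (s^-1)
    """
    p_major = 1.0 - p_minor
    return p_major * p_minor * delta_omega ** 2 / k_ex

def matching_delta_omega(rex, p_minor, k_ex):
    """Shift difference (rad/s) that reproduces `rex` for a given population."""
    p_major = 1.0 - p_minor
    return math.sqrt(rex * k_ex / (p_major * p_minor))

k_ex = 5000.0  # hypothetical exchange rate, s^-1

# Scenario A: a sparsely populated state with a large shift difference
rex_a = rex_fast_exchange(0.05, 1000.0, k_ex)

# Scenario B: a more populated state with a smaller shift difference
dw_b = matching_delta_omega(rex_a, 0.20, k_ex)
rex_b = rex_fast_exchange(0.20, dw_b, k_ex)

print(f"A: p_minor = 0.05, delta_omega = 1000.0 rad/s, R_ex = {rex_a:.2f} s^-1")
print(f"B: p_minor = 0.20, delta_omega = {dw_b:.1f} rad/s, R_ex = {rex_b:.2f} s^-1")
```

In this limit only the product p_A p_B Δω² is constrained by the data, which is why populations and exchange rates are usually shared across a group of residues during fitting.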

The power of these techniques was demonstrated in experiments that probed the allosteric behavior of the signaling protein nitrogen regulatory protein C (NtrC). Relaxation methods demonstrated that phosphorylation causes a population shift between two pre-existing conformations, providing strong validation of the MWC model of allostery [30]. Mutational perturbations to the structural transition pathway predicted by molecular dynamics simulations confirmed that phosphorylation stabilizes the minor conformation of the unphosphorylated state [31]. Later in this review, we describe three enzymes where different classes of structural rearrangements have been probed by relaxation methods and how crystallography can be leveraged to provide a structural interpretation of the dynamic signal.

Connecting polysterism to enzyme function

Much of the effort in protein structure–function studies over the past decades has focused on defining structural folds and static active site chemistries. Considerably less energy has been devoted to defining the role of polysterism and conformational change. This allocation of effort is understandable since studying the full conformational landscape of an enzyme before understanding its catalytic mechanism and structural fold would not provide particularly useful information. Another reason for the comparative neglect of polysterism in the literature is that experimentally characterizing scarcely populated states of the ensemble remains difficult and that linking these minor conformations to function is even more challenging [32, 33].

However, several recent developments have led to the increased study of the role of polysterism in enzyme function. Our understanding of active site chemistry has now developed to the level where functional active sites can be designed de novo [34, 35]. Concurrently, many technical advances in crystalline and solution structural studies have made studying dynamics or polysterism feasible for a majority of proteins [36]. This has meant that simply describing a structure and putative catalytic mechanism is now only the beginning of many structure–function studies. This review examines ‘protein structural dynamics’ by focusing on the detection of protein polysterism and its relevance to the equilibrium steps of protein function and evolution. The dynamic, or time-resolved, aspect and the role of conformational change in the chemical step of catalysis are beyond the scope of the review and have recently been reviewed elsewhere [32, 33]. The challenge over the coming years will involve understanding how conformational substates, and the dynamic transitions between them, function during all steps of the catalytic cycle of an enzyme. We will now discuss six recent examples where this challenge has been answered.

Dihydrofolate reductase; polysterism through a catalytic cycle

Dihydrofolate reductase (DHFR) catalyzes transfer of hydrogen from NADPH to dihydrofolate to yield tetrahydrofolate, a required step in the synthesis of purines and some amino acids. This central role in metabolism also makes DHFR an attractive anti-cancer and anti-bacterial drug target. Induced fit interpretations of DHFR conformational changes observed by crystallography at different states of the catalytic cycle by Kraut and coworkers [37, 38] led to an increased understanding of the catalytic mechanism. Subsequent NMR experiments showed, with reference to the crystal structures, that diverse conformational fluctuations, including hydrophobic core rotamer exchange [39] and loop motions [40], are pre-existing in each functional state and may help to drive the transitions during the catalytic cycle [41]. The linear agreement of high-energy chemical shifts from relaxation dispersion experiments and ground-state chemical shifts of the neighboring steps of the reaction cycle implies that polysterism helps to bias the enzyme conformation along the reaction trajectory (Fig. 1a). Furthermore, the agreement between the rate constants of the reaction and conformational sampling supports the idea that protein motions, rather than active site chemistry, limit turnover.

Fig. 1

Catalytically important motions detected by NMR relaxation experiments can arise through different structural mechanisms. a Polysterism revealed by comparison of NMR chemical shifts with reference to independent crystal structures in DHFR (1RC4 and 1RB2). The Met20 loop of DHFR samples a minor conformational substate (shown in orange) when bound to the substrates NADP+ and THF. The minor state chemical shift is correlated with the chemical shift of the major conformational substate for the immediately preceding state of the catalytic cycle. b Independent crystal structures of AdK in the apo/open state (blue 2RH5) and nucleotide-bound/closed state (yellow, with nucleotide in red 2RGX) reveal that subdomain lid closure underlies the NMR relaxation signal. In an orthogonal view, independent chains from one crystallographic asymmetric unit (blue, light blue, and cyan 2RH5) reveal that the transition path towards the closed state (yellow) is populated even when no nucleotide is bound. Residues on mobile loops are shown in stick and sphere to provide a visual landmark of the polysterism. c In the room-temperature structure of CypA (3K0N), interconverting side-chain conformations give rise to an NMR relaxation dispersion signal. The minor (orange) and major (red) conformational substates extending from the core (left) to the active site Arg55 (right) are shown within electron density mesh. Examining the electron density at low levels (light blue 0.3σ) below normal contour (dark blue 1σ) reveals the minor conformational substate

The dynamic properties of DHFR have led to new therapeutic and protein engineering strategies. A small molecule that breaks the coupling of dynamic features acts as an efficient inhibitor of the enzyme [42]. The fusion of a PAS domain to a specific surface site of DHFR allosterically couples dynamic loop regions and a network of internal residues that promote catalysis [43]. This strategy generated a novel allosteric circuit that allowed for light/dark-control of DHFR enzymatic activity [44]. Domain fusions that bias the conformations of flexible active site loops may represent a general strategy for engineering allosteric control of enzyme function. For other enzymes, the challenge is to define polysteric regions and evolutionarily conserved sites that can be used to mechanically couple allostery, conformational flexibility, and enzyme function.

Antibodies; polysterism leads to binding plasticity

The work on antibody specificity by Tawfik and colleagues [45] has provided clear evidence of functionally relevant polysterism, through the use of crystallography and fast-kinetic analysis of pre-equilibrium states. Crystallographic analysis of catalytic antibodies raised against a transition-state analogue of carboxylesters indicated that the structure of the ligand-free form of the antibody was essentially identical to the structure of the antibody-TS analogue complex. These results led to the conclusion that catalytic antibodies most likely follow a simple lock-and-key mechanism in ligand binding [46, 47]. However, pre-steady-state kinetic experiments revealed that the antibodies exist in a pre-equilibrium between distinct conformational substates and that ligand binding induces an equilibrium shift to the bound state conformation [48]. Moreover, the crystallization conditions were found to bias the conformational distribution of the apo-enzyme towards the bound-like state, explaining why only this configuration was observed by X-ray diffraction.

The model that ligand binding stabilizes selected conformations of a pre-existing equilibrium was further tested through analysis of another antibody, SPE7. SPE7 was raised against a small molecule hapten (2,4-dinitrophenyl) and was also found to bind the protein antigen (Trx-Shear3) [49, 50]. Pre-steady-state kinetics again established a pre-equilibrium consisting primarily of two conformations that could be altered through the addition of hapten or protein antigen. The kinetic data were supported by crystal structures of the two major pre-equilibrium states and of antibody:hapten and antibody:protein complexes. As shown in Fig. 2, the major (AB1) and minor (AB2) ligand-free substates respectively resemble the protein (AB4) and hapten (AB3) bound structures. These similarities are most apparent in the light chain of the antibody, where a cleft at which the hapten binds is formed by Y34, W93, and N96 in AB2/AB3, but is largely absent in AB1/AB4 as a result of different conformations of these side chains. Notably, despite this selection of pre-existing conformational substates by hapten- or protein-binding, the final bound structures still display significant differences with the unbound states. This suggests that some induction of conformational change still occurs. This aspect was subsequently addressed in greater detail when the interactions between SPE7 and haptens with high and low affinity were tested. Both high- and low-affinity haptens were observed to form identical transition complexes with similar affinity, yet only the high-affinity hapten formed hydrogen bonds with previously unexposed parts of the antigen to allow the final bound form to be ‘locked in’ [51]. These examples demonstrate that ligand binding selectivity (and promiscuity) can originate from inherent conformational polysterism and that both selection of pre-existing states and induced conformational change can play a role in protein–ligand interaction.

Fig. 2

Hapten or protein binding selects pre-existing conformations of the antibody SPE7. Two distinct conformations of unbound SPE7 (AB1: 1OAQ; AB2: 1OCW) have been characterized in which the part of the binding site formed by Y34, W93 and N96 of the light chain adopts different conformations. Owing to the orientation of W93, the binding site of the major conformation, AB1, is considerably ‘flatter’ than AB2. Ligand binding selects from these pre-existing substates, with protein-binding (AB4: 1OAZ) maintaining the AB1-like conformation of these residues and hapten-binding (DNP-Serine; AB3: 1OAU) maintaining an AB2-like conformation

Acetylcholinesterase; polysterism from simulations and crystallography

Acetylcholinesterase (AChE) has been studied in great detail owing to its essential role in nerve signal transduction [52] and its inhibition by pharmaceuticals [53] and both natural and synthetic toxins [54, 55]. It is one of nature’s most efficient enzymes, with high turnover (k_cat) values of ~10^5 s^−1, and k_cat/K_M values (~10^9 s^−1 M^−1) that approach the diffusion limit [56]. When the crystal structure of AChE was first solved in 1991, the unexpected observation that its active site was located at the base of a very deep and narrow gorge [57] seemed at odds with the magnitude of its kinetic parameters and indicated that conformational change might play an essential role in substrate and product diffusion to/from the active site. In addition to ‘breathing’ of the active site cleft [58], a subtle conformational rearrangement of side chains occurs during the catalytic cycle of AChE that promotes efficient catalysis and substrate and product diffusion. These discoveries have emerged from a series of X-ray crystallography experiments and molecular dynamics simulations. Three amino acids have been found to be prominently polysteric: F330, W279, and W84 (Fig. 3).

Fig. 3

Polysterism in the active site and peripheral site of acetylcholinesterase. The gateway to the active site from the peripheral anionic site in AChE is formed by F330 and Y121. The positioning of substrate in the active site and peripheral anionic site is shown by the substrate analog OTMA from the structure 2C5F. Several structures of AChE are overlaid, showing the conformational flexibility of F330. These include: 2C4H (green), 2C58 (cyan), 2C5F (magenta), 2C5G (light yellow), 2V96 (light grey), 2V98 (orange), 1EA5 (pink), 2CEK (grey), 1ACJ (dark yellow). Polysterism in the ‘backdoor’ residue of the active site is shown by the structures 1EA5, 2V98, and 2V96. Polysterism of W279 at the peripheral anionic site is shown by 2CKM (purple), 1EA5 and 1ACJ

Colletier et al. [59] studied the migration of acetylcholine into the active site using substrate analogues and exploiting substrate inhibition. They observed that in native crystal structures, the cross section between F330 and Y121, known as the active site ‘gate’, is only 5 Å, less than the diameter of the quaternary group of the substrate acetylcholine. A series of crystal structures was acquired in which different rotameric states of F330 were observed to drastically change the size of the active site gate. These conformational changes were found both to control access to the active site by fluctuating between open and closed conformations and to provide cationic interactions with the substrate during the chemical step in the catalytic cycle. Different rotameric states of W279, located in the peripheral anionic site, have also been observed as a result of ligand binding, in particular with bi-functional inhibitors that bind at both the peripheral anionic site and the active site [60]. This polysterism and its role in ligand binding have been referred to as “dynamic selectivity” by McCammon and colleagues [61], who observed its effects through molecular dynamics (MD) simulations. Finally, the active site residue W84 may fluctuate through rotameric states that allow product release through a “backdoor”, which would eliminate competition with incoming substrate for entrance to the active site through the narrow gorge [62]. While structural comparisons have revealed heterogeneity in this region through static [55, 63] and kinetic [64] crystallography experiments and molecular dynamics simulations [67], mutagenesis studies have been inconclusive [65, 66].

The polysterism discussed above was observed through analysis of many separate crystal structures; it was not known whether these conformations existed in the absence of ligands. To address this question, Xu et al. [67, 68] performed a series of molecular dynamics simulations and compared the sampled geometries with the many apo-enzyme and ligand bound structures of AChE available in the literature. They found that the apo-enzyme continuously samples almost all of the side-chain conformations observed to result from ligand binding. These results suggest that, in AChE at least, conformational fluctuations provide access to a polysteric landscape of distinct conformational substates that possess unique catalytic properties tailored to specific steps along the catalytic cycle and have different ligand affinities.

Adenylate kinase; subdomain polysterism from a single crystal

Adenylate kinase (AdK) maintains the balance of cellular energy by catalyzing the interconversion of adenine nucleotides. Alignment of the apo and nucleotide-bound crystal structures of AdK revealed a very large conformational transition [69]. The apo-structure resembles an open hand with the finger and thumb lids (Fig. 1b). In structures of the enzyme bound to its nucleotide substrates, the two lids close down around the active site providing a favorable electrostatic environment for phospho-transfer. NMR studies showed that the conformational lid rearrangement, rather than the chemical phospho-transfer step, limits the rate of the catalytic cycle in homologous AdKs [70]. Given the importance of lid flexibility, several studies have examined how motion is induced by substrate.

Surprisingly, convincing evidence that lid rearrangement is pre-existing in the apo-enzyme came from comparisons of independent chains within a single crystal (Fig. 1b) [71]. In crystals obtained by Henzler-Wildman et al., three independent copies of apo-AdK each experience a different crystal-packing environment. The changes in crystal contacts allowed the two lids to rotate about several hinges. The phenomenon of hinge-bending in multiple chains from a single crystal was first observed in a mutant T4 lysozyme [72]. For AdK, the alignment of the independent copies suggested a concerted movement of the lids towards the closed conformations even in the absence of substrate. AdK represents an example of polysterism observed through multiple chains within a single crystal. The conformational heterogeneity observed in the crystal resembles the solution trajectory predicted by computational analysis of crystal structures of AdK in different nucleotide-bound states.

A comprehensive suite of single-molecule FRET, NMR, and computational studies further confirmed the relevance of crystalline polysterism in AdK. These studies concluded that the enzyme could harvest the directionality of pre-existing motions during a catalytic cycle [71, 73]. This result builds upon the theme that substrate binding does not induce entirely new structures; rather, binding shifts the populations of substates that pre-exist in the ensemble. Although this work defined the apo-state energy landscape, many detailed mechanistic questions remain about how the lid movements are controlled. Particularly, questions remain as to whether lid closing involves local unfolding [74] and the ratio of populations of the open and closed forms in different nucleotide states [75]. AdK remains an excellent system for examining the relevance of enzyme conformational heterogeneity to the catalytic cycle. Polysterism, from multiple independent chains in the same crystal, was key to unraveling this relationship in AdK. However, molecular dynamics simulations and NMR experiments provide the necessary time-scale resolution to augment the functional importance of the pre-existing conformational substates.

Cyclophilin A; polysterism in electron density at different temperatures

Cyclophilin A (CypA) catalyzes the isomerization of proline residues in proteins, which can regulate protein folding, signaling, and molecular assembly. CypA, like AdK and DHFR, undergoes motions detectable by NMR during catalysis that correspond to the catalytic turnover rate [76]. Furthermore, motions in CypA are pre-existing in the active site, hydrophobic core, and loop regions whether or not substrate is bound [77]. However, unlike AdK or DHFR, comparison of X-ray structures did not suggest a mechanism for the conformational diversity in the active site and core regions detected by NMR. In other systems, large backbone movements of loops or hinge-based subdomain reorientations provide the chemical shift differences detected by relaxation dispersion experiments. Although dynamics of the loop regions of CypA arise because of this conventional motion, the active site and core motions are surprisingly caused by a coupled network of interconverting side-chain substates [78] (Fig. 1c).

The minor state side-chain conformations of apo-CypA were discovered by using the program Ringer [5] to sample low electron density levels in a map obtained from data collected at room temperature. Interestingly, conventionally frozen crystals only displayed evidence of the major state. The observation of these substates confirms a link between crystal data collection temperature and substate population [79]. Due to the limited number of enzymes where structures are available at multiple temperatures, it is unclear whether room temperature data collection universally leads to a more accurate description of conformational substates. For example, in alpha-lytic protease, higher resolution cryogenic data collection enabled a more complete description of the conformational landscape than a room temperature structure [80].
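
The idea of looking for side-chain substates below the usual 1σ contour can be sketched with a toy calculation. The Python example below is not Ringer's implementation and does not read a real electron density map: it assumes a synthetic density profile, expressed in σ units, sampled as a side-chain dihedral angle is rotated in 10° steps, and then reports local maxima that rise above a 0.3σ floor. The function names, the peak positions, and the peak heights are all hypothetical.

```python
import math

def toy_density(chi_deg, peaks):
    """Synthetic density (sigma units) at a given dihedral angle.

    Each entry of `peaks` is (center_deg, height_sigma, width_deg).
    """
    total = 0.0
    for center, height, width in peaks:
        # Shortest angular distance on the circle, in the range [-180, 180)
        d = (chi_deg - center + 180.0) % 360.0 - 180.0
        total += height * math.exp(-0.5 * (d / width) ** 2)
    return total

def find_peaks(angles, values, floor):
    """Return (angle, value) pairs for circular local maxima at or above `floor`."""
    n = len(values)
    found = []
    for i in range(n):
        left = values[i - 1]            # index -1 wraps to the last sample
        right = values[(i + 1) % n]     # wrap from the last sample to the first
        if values[i] >= floor and values[i] > left and values[i] >= right:
            found.append((angles[i], values[i]))
    return found

# Hypothetical profile: a strong major rotamer and a weak minor rotamer
model = [(300.0, 2.0, 15.0), (180.0, 0.5, 15.0)]
angles = list(range(0, 360, 10))
values = [toy_density(a, model) for a in angles]

peaks = find_peaks(angles, values, floor=0.3)
major = [a for a, v in peaks if v >= 1.0]
minor = [a for a, v in peaks if v < 1.0]
print("peaks visible at the 1 sigma contour:", major)
print("peaks visible only between 0.3 and 1 sigma:", minor)
```

In this toy profile the weaker peak would be invisible at a 1σ contour but is recovered once the threshold is lowered, which mirrors the contrast between the 1σ and 0.3σ meshes in Fig. 1c. With real data, such low-level peaks would still need to be distinguished from noise and from neighboring atoms.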

A remote mutation designed to test the importance of the side-chain substates in CypA not only flipped the population of the substates, but also caused a parallel decrease in the rate of conformational interconversions measured by NMR and the rate of catalysis [78]. This and other mutations tested with the HIV capsid as a substrate [81] suggest that the catalytic cycle of CypA is limited by conformational transitions that are populated in the apo-state. Interestingly, evolution has harvested CypA polysterism as a mechanism for species restriction in HIV. Elegant structural and thermodynamic studies revealed the mechanism that targets a retrotransposed mutant CypA in rhesus macaque monkeys to bind specifically to HIV-2 capsids [82]. Two mutations, located in the loop regions identified as polysteric by NMR relaxation dispersion analysis, dramatically reshape the dominant conformation of the binding pocket. This study further showed a correspondence between loop conformation, binding selectivity, and sensitivity to viral infection. Since wild-type CypA binds the capsid of both HIV-1 and HIV-2, the conformation stabilized by the mutations is likely present in the wild-type ensemble. Thus, CypA represents an example of how polysterism, revealed through comparison of multiple crystal structures, can be used in the evolutionary arms race between host and pathogen by sculpting loop conformations to determine binding specificity. The binding selectivity imparted by loop motion complements the network of residues extending from the core to the active site that undergo side-chain rearrangements that are necessary for turnover [78].

Bacterial phosphotriesterase; polysterism promotes evolution

The bacterial phosphotriesterase (PTE) has attracted considerable attention as a result of its rapid evolution of extremely high catalytic efficiency for the breakdown of human-made pesticides such as paraoxon. Raushel and coworkers [83–85] have established that the catalytic cycle comprises separate steps involving formation of the Michaelis complex, the chemical step, and product release, and that for reactive substrates, such as paraoxon, the slowest step is product release or conformational change. Recently, different states along the reaction coordinate have been captured in crystallo, providing new structural insight [11, 86].

Although the enzyme:substrate and enzyme:product complexes of PTE are consistent with a highly efficient chemical step in the catalytic cycle, it was observed that, as in AChE [59], the entrance to the active site was smaller in diameter than the cross section of the substrate/product [87]. This observation pointed to an important role for conformational change in the catalytic cycle. To investigate this in more detail, a series of crystal structures of mutants of PTE were solved in which the active site was observed to exist in equilibrium between ‘open’ and ‘closed’ conformations that are ideally configured for non-chemical (substrate and product diffusion) and chemical (catalysis) steps in the catalytic cycle, respectively (Fig. 4). Different mutations were found to affect the distribution of open and closed states, and through this the rate of product release. It should be noted that the observation and building of these minor geometries was performed through human inspection of electron density maps. As discussed in a later section, recent technical developments have enabled detection of much more, previously ‘hidden’ polysterism, automatically. This study characterized the conformational landscape of the PTEs and linked these states to different steps in its function, but more importantly demonstrated that mutations accrued through evolution can alter activity by modulating the energy barriers, and thus the accessibility, of different conformational substates required for function.

Fig. 4

Major and minor conformations of the bacterial phosphotriesterase. The major ‘closed’ conformation of PTE is shown in cyan, while the minor ‘open’ conformation is shown in grey (3A4J). Both conformations were modeled into the electron density of an apo-enzyme crystal. The substrate Z-chlorfenvinphos is shown docked based on known enzyme:substrate complexes (2R1N). It can be seen that in the ‘closed’ conformation, several side-chains adopt rotamers that extend into the active site, greatly reducing its volume. One of these, F132, is shown to sterically clash with Z-chlorfenvinphos in its dominant conformation but to allow binding in its minor ‘open’ conformation

A similar story is seen in the role of polysterism in promiscuous, or non-native, PTE activity. Previous work had shown that the slow turnover of the pesticide chlorfenvinphos was principally a result of the active site gorge of the enzyme being too narrow to accommodate the substrate [88]. Mutation of a phenylalanine to an alanine opened the active site gorge and resulted in a 500-fold increase in catalytic efficiency. The question remained: how does the enzyme catalyze hydrolysis of chlorfenvinphos at all when there is apparently no possible binding mode? The description of the multi-state equilibrium model of the phosphotriesterase provides an explanation [87]: the primary residue that blocks chlorfenvinphos binding (F132) is observed to adopt a minor conformation in which its ring is flipped out of the active site in such a way as to allow productive chlorfenvinphos binding. This provides a straightforward example of the hypothesis put forward by Tokuriki and Tawfik [36], which posited that conformational diversity can provide important starting points for evolution. It also demonstrates that polysterism can allow promiscuous activities to be catalyzed by minor conformational substates that may not be used for the analogous step in the native catalytic cycle.

Glucocorticoid receptor; polysterism in allosteric signaling

The importance of minor states revealed by weak electron density extends beyond catalysis. For example, the allosteric effect of DNA binding to the glucocorticoid receptor (GR) is manifested in alternate conformations, observed in weak electron density, that modulate the responsiveness to hormone treatment (Fig. 5) [89]. A series of crystal structures of GR bound to different DNA sequences revealed polysterism in the lever arm. These structures demonstrated the existence of conformational substates in this region through comparisons between structures and also within individual electron density maps. Allosteric modulation of GR conformational substates alters the recruitment of the transcriptional machinery and leads to different transcriptional activation, independent of the effects of binding affinity to the DNA sequence. These studies highlight the critical role that crystallography will continue to play in characterizing the importance of minor conformational substates in many aspects of protein function, from catalysis to signaling.

Fig. 5

DNA binding allosterically modulates lever arm conformations and the transcriptional activation of glucocorticoid receptor. a The glucocorticoid receptor bound to the Sgk-sequence DNA (3G9P) adopts a dimeric structure with polysterism in the lever arm. b The electron density (dark blue 1σ, light blue 0.3σ) for the chain A (yellow) lever arm is well fit and suggests only small-amplitude motion around the major conformation. c In contrast, despite originating from the same map, the electron density for the lever arm of chain B (black) reveals potential polysterism. Although the lever arm is modeled in a different conformation, there is much unexplained density at low levels. d The conformations of the lever arms of chain A (yellow) and chain B (black) represent the major structural difference between the two chains. Low levels of electron density (light blue 0.3σ) shown from the area of chain B suggest that the conformation modeled for chain A may represent a component of the ensemble for both chains.

Technical challenges and recent advances

Given the importance of conformational substates in evolution and the catalytic cycles of enzymes, the question remains: what are the best ways to reveal these critical minor states? Clearly the best description of the native ensemble comes from a combination of methods, often including NMR relaxation techniques, and experimental tests of the importance of the observed polysterism. Here, we have reviewed several ways that protein crystallography can provide insight into the spatial distribution of conformational substates. For example, structural changes can be revealed through independent crystal structures (DHFR, antibodies, AChE), comparisons of multiple chains in the same crystal (AdK), or examination of the electron density for minor conformations (CypA, PTE, GR).

Recently developed crystallographic refinement methods aim to describe the ensemble populated in the crystal. These techniques complement the traditional B-factor calculation, which gives an indication of the small-amplitude flexibility around a ground state. Although there is often a concordance between B-factors and fast small-amplitude harmonic motions in solution [90], several alternative methods have been proposed to explore more dramatic motions that can be difficult to interpret in “messy” electron density. For example, automated rebuilding and refinement techniques can create an “ensemble of independent models” [91, 92]. Such a family generally consists of a fixed number of independent models that do not simultaneously contribute to the refinement process. These methods have been used to suggest that many models can explain the diffraction data equally well [92]. A systematic investigation of an “ensemble of models” created with synthetic data revealed that independently refined models represent the uncertainty inherent in a single-model description of the possible conformational space populated in the crystal [91].
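As a reminder of what a B-factor encodes, the sketch below applies the standard relation for an isotropic harmonic model, B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square atomic displacement along one direction. The function names are illustrative and not part of any refinement package; the calculation assumes that the B-factor reflects displacement alone, whereas in practice it also absorbs static disorder and model error.

```python
import math


def msd_from_b(b_iso):
    """Mean-square displacement <u^2> (in A^2) from an isotropic B-factor (in A^2)."""
    if b_iso < 0:
        raise ValueError("B-factor must be non-negative")
    return b_iso / (8.0 * math.pi ** 2)


def rms_from_b(b_iso):
    """Root-mean-square displacement (in A) implied by an isotropic B-factor."""
    return math.sqrt(msd_from_b(b_iso))


# Example: B-factors spanning a typical range for a well-ordered protein crystal.
for b in (10.0, 20.0, 40.0):
    print(f"B = {b:5.1f} A^2  ->  rms displacement = {rms_from_b(b):.2f} A")
```

A B-factor of 20 Å² corresponds to an rms displacement of about 0.5 Å, which makes clear why discrete rotamer changes or loop rearrangements of several ångströms are poorly described by a single harmonic term.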

To more directly model the multiple structures populated in the crystal, many “multi-copy ensemble” refinement methods have been developed [7]. In “multi-copy ensemble” procedures, a fixed number of interdependent models are simultaneously refined [7]. For certain anharmonic polysteric regions, this procedure can lead to a much better description of the electron density. However, given the limited number of experimental observations in crystallographic data, for many residues it may not be necessary to introduce the added parameters needed to refine locally similar regions for the multiple copies of the ensemble.

At high resolution, it may be more advantageous for many residues to be modeled in a single conformation with anisotropic B-factors rather than as part of a “multi-copy ensemble”. For other residues, multiple alternative conformations may be needed to fit the electron density. This is traditionally accomplished through manual inspection of electron density maps, which often leads to an under-estimation of the extent of polysterism [5]. An algorithm, qFit, that automates such a strategy has recently been developed [6]. qFit generates a “multi-conformer model” that is generally a single model. However, for select residues, alternative backbone and side-chain conformations are modeled into the electron density. This method gives qualitatively similar results to iterative electron density sampling and manual rebuilding and can capture complex coupled motions that extend over many residues [93].

The automation provided by qFit augments existing strategies for discovering and modeling alternative conformations. For example, electron density sampling, as implemented in Ringer [5], reveals that many alternative conformations exist at electron density levels that have traditionally been considered noise by human model builders. Ringer samples electron density in dihedral space and discovers peaks indicative of unmodeled alternative conformations. These alternative conformations can be built after manual inspection of the electron density map and further refined for occupancy by phenix.refine [94]. As seen in PTE and CypA, alternative conformations revealed at low levels of electron density (generally 0.3–1σ) can be critical for enzyme function.
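The idea of sampling density in dihedral space and looking for secondary peaks can be sketched in a few lines of Python. The example below is not Ringer's implementation: it operates on a synthetic one-dimensional profile of density (in σ units) versus a side-chain χ angle, and the function name, the 10° sampling step, the Gaussian peak shapes, and the 0.3σ default threshold are all assumptions chosen for illustration. It simply reports local maxima of a circular profile that exceed the threshold.

```python
import numpy as np


def find_density_peaks(chi_deg, density_sigma, threshold=0.3):
    """Return chi angles where a circular density profile has a local maximum
    at or above `threshold` (in sigma units)."""
    chi = np.asarray(chi_deg, dtype=float)
    rho = np.asarray(density_sigma, dtype=float)
    prev_rho = np.roll(rho, 1)    # value at the previous angle (wraps around 360)
    next_rho = np.roll(rho, -1)   # value at the next angle (wraps around 360)
    is_peak = (rho > prev_rho) & (rho >= next_rho) & (rho >= threshold)
    return chi[is_peak]


# Synthetic profile: a major rotamer near 60 degrees and a weak minor one near 300.
chi = np.arange(0, 360, 10)


def bump(center, height, width=15.0):
    """Gaussian-shaped peak on a circle, evaluated at the sampled chi angles."""
    delta = (chi - center + 180.0) % 360.0 - 180.0  # signed angular distance
    return height * np.exp(-0.5 * (delta / width) ** 2)


profile = bump(60, 2.5) + bump(300, 0.6)

print(find_density_peaks(chi, profile))                 # both peaks at the 0.3 sigma cutoff
print(find_density_peaks(chi, profile, threshold=1.0))  # only the major peak at 1 sigma
```

In this synthetic case the minor peak at 0.6σ is missed at a conventional 1σ contour but recovered at 0.3σ, mirroring the observation that alternative conformations often sit below the levels usually inspected by eye. Any such peak would still need to be checked against the map and refined before being interpreted.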

Another means by which crystallography can access protein polysterism is the analysis of thermal diffuse scattering [95, 96]. Diffuse scattering is the remaining X-ray intensity that is recorded in diffraction images once the Bragg reflections are subtracted. While analysis of the Bragg reflections can give accurate estimates of the individual atomic mean-squared displacements and alternate conformations, the diffuse scattering can provide valuable insight into isotropic liquid-like displacements [97] and strongly correlated rigid-body movements of groups of atoms [98]. This can reveal how particular groups of atoms are displaced together as rigid bodies. Techniques that predict low-frequency correlated rigid-body movements, such as principal component analysis of molecular dynamics simulations or normal mode analysis of elastic network models, can be validated by analysis of diffuse scattering [99, 100]. This theoretical–experimental cross-validation approach can also serve to support or exclude other models of correlated motion used in refinement, such as translation-libration-screw (TLS) refinement [101, 102]. Whether analyzing diffuse or Bragg scattering, access to primary image data would greatly increase the ability to re-examine biological conclusions. Although deposition of scaled reflection data, which permits re-calculation of electron density maps for analysis by Ringer and other software, is now standard for almost all journals, image data are rarely deposited [103]. A major limitation to deposition of image data is storage space, but several institutions, structural genomics consortia, and synchrotrons have started voluntary image deposition. As interpretation of atomic displacements becomes more important to our understanding of protein function, our reliance on the re-analysis of primary data is likely to increase in coming years.

Conclusions

Polysterism has contributed to our understanding of how new functions evolve from existing functions. Many studies have now demonstrated that an ensemble of interconverting structures, rather than a single structure, exerts the function of the protein. Thus, evolutionary selection does not optimize only an active site or the stability of a single structure. Rather, it shapes an entire conformational landscape comprising many sub-structures, their relative populations, and their rates of interconversion. Because mutations may affect either the dominant structure modeled in a typical crystal structure or “hidden” minor conformations, polysterism must be carefully considered when evaluating the role of remote mutations in both catalysis and allostery [104]. If we can accurately describe the conformational landscape of proteins, we might be able to better predict the nature and conformations of transitions between the dominant structures as a protein is perturbed. Realizing this dream has implications for understanding protein evolution and designing better enzymes and inhibitors [105].