Introduction

Allosteric modulation of proteins, discovered more than fifty years ago1, plays an important role for many processes, from signal transduction2 to transcriptional regulations3.

This phenomenon regards any molecular event in which the binding of a ligand in a specific region of a protein (allosteric binding site) affects the stability of distant “primary” binding site (often referred as orthosteric binding site)4,5. The term allosteric is coined for the first time by Monod, Wyman and Changeux within the WMC model6. According to it, proteins can assume two different conformations, each of them exhibiting different binding affinity for the orthosteric ligand. The allosteric ligands can affect the thermodynamic stability of this conformations modifying, consequently, the orthosteric binding affinity. This model implies a protein conformational change mediating the interaction between the two distant sites. However, it has been shown that the allosteric binding site can affect the orthosteric binding also without involving any conformational rearrangement7, modifying, for instance, proteins vibrations around their thermodynamic stable conformations7. These different types of allosteric modulation share a common and peculiar feature, namely the presence of a long-range communication between two different sites of a protein8. Allosteric modulators can be distinguished on the basis of their contribution to the free energy of binding. In the case that such contribution is manly enthalpic, (Type I), it usually leads to a conformational change9. In contrast, when entropic contribution is prevalent (Type II), molecular vibrations are mainly affected. Usually then, no appreciable conformational changes are observed9. In this work we will consider only this case. Finally, cases in between the first two types are also possible9. Allostery can be exploited for drug development. Indeed, allosteric ligands can cause an increase the orthosteric ligands’ affinity for their protein target10. These ligands (“Positive Allosteric Modulators”, PAMs) turn out to decrease the dissociation constant (Kd) of the orthosteric ligand upon target binding. In contrast, Negative Allosteric Modulators (NAMs) increases Kd while Neutral Allosteric Ligand (NALs) do not affect Kd11. Understanding the molecular basis of allosteric modulation and identifying the key elements which make this phenomenon possible is therefore of great importance for both basic science and drug discovery.

We use as a test case the M2 muscarinic acetylcholine receptor (M2), for which large structural information are available. This receptor belongs to the class A of the G-protein Coupled receptor (GPCRs) superfamily. GPCRs are transmembrane proteins which trigger a signal cascade inside the cell upon binding with the G-protein. The allosteric binding occurs prevalently when the receptor assumes a particular conformation called active state12. In most cases, the active state is promoted, by the binding of small molecules called agonist13. Here we study the binary complex between M2 with its agonist iperoxo as well as the ternary complex between the receptor, iperoxo and the PAM LY211960, whose X-ray structures have been determined (Fig. 1a,b).

Figure 1
figure 1

M2 X-ray Crystal Structures. (a) M2 muscarinic acethylcholine receptor bound to an orthosteric agonist (Iperoxo), in green. (b) M2 muscarinic acethilcholyne receptor bound to the allosteric modulator (LY211960), in orange, and the agonist. Both complexes are embedded in a membrane (See methods). These figures have been created with VMD 1.9.3 software package39. (http://www.ks.uiuc.edu/Research/vmd/).

GPCRs possess very complex energy landscape presenting several stable conformations14. Allosteric modulators can modify this landscape affecting the stability and/or the dynamical properties of each conformation8. In this respect, our study focuses on two stable states of a GPCR/ligand adduct. The two states are expected to be two free energy minima. It reasonable to assume that they are active states15. The two X-ray structures are not too dissimilar, indeed, their RMSDs differ by only about 1 Å and this difference remains small during a molecular dynamic simulation (see Fig. SI2). Therefore, the allosteric modulation, in this particular case, does not seem to involve a significant shift in the population of conformational states. Rather, it causes a local reshape of the energy landscape of the same conformational state16. Investigating cases in which the allosteric modulation leads to a population shift in the conformational states is beyond the scope of this paper.

Therefore, in this case, it is reasonable to think the allosteric modulation occurs by means of molecular vibrations of the protein involving the transfer of the associated energy, the vibrational energy (Evib). In the following, the allosteric modulation will be indeed modeled as exchange of Evib among protein residue. This study will be focused only on this specific Type II PAM since sufficient experimental data and studies are available15,17,18.

Results and Discussion

As stated above, in this work we describe the allosteric modulation as communication which consist in the Evib exchange among protein residues. This exchange is modeled as a diffusion process to which a master equation can be associated with.

$$\frac{d{\boldsymbol{p}}(t)}{dt}={\boldsymbol{p}}(t)\cdot {\boldsymbol{L}}$$
(1)

p(t) is a N-entries vector, where N is the number of residues. Each of its entries represents the percentage of total Evib relative to each residue at time t (e.g. \({p}_{i}(t)=\frac{{E}_{vib}^{res\,i}(t)}{{E}_{vib}\,}\), where \({E}_{vib}^{res\,i}(t)\) is the vibrational energy of the i-th residue. Obviously, we do not the exact value of \({E}_{vib}^{res\,i}(t)\) and Evib but just their ratio pi(t)). L is the transition frequency matrix, whose elements Lij are the frequencies associated with transition times τij, which are the characteristic time to exchange Evib between residue i and j, as calculated in reference19. Each element of L is defined as:

$${{\boldsymbol{L}}}_{ij}=\{\begin{array}{c}\frac{1}{\,{\tau }^{ij}\,}\,if\,i\ne j\,\\ -\sum _{k\ne i}\frac{1}{\,{\tau }^{ik}\,}\,if\,i=j\end{array}$$
(2)

The Master Eq. (1) defines a Transition Network (Markov State Model or simply Markov chain) in which nodes are residues connected via edges whose weights are the element (Lij) of the matrix. L can be decomposed in its eigenvectors. Each of them is associated to an eigenvalue, i.e. a frequency. The eigenvalue zero represents the thermal equilibrium, in which all the residues have the same amount of Evib, and the corresponding eigenvectors (p) is the so-called stationary distribution since dp/dt = p·L = 0.

All the other eigenvectors represent metastable states, through which the system pass by to reach the thermal equilibrium and they “exist” for a finite period of time. In other words, they describe states in which some residues possess a larger Evib than others. This difference causes the energy exchange that we decided to employ for modeling the allosteric communication. Therefore, identifying group of residues which can retain more Evib during the process of reaching the equilibrium can be highly interesting because they might represent biologically relevant regions and studying Evib exchange among them will shed light on how allosteric modulation occurs in protein. Therefore, in the following, we will identify first this regions and study how the Evib exchange occurs among them.

Our analysis focuses here on all the residues of the M2 (Fig. 1), except those belonging to the third intracellular loop. Due to its high mobility, indeed, the approximations needed for calculating τij no longer holds19 and all the possible values becomes unreliable. The loop is included in our MD simulations (see Methods). For the β2 adrenergic (another class A GPCR such as the M2 receptor), this loop is a distinct domain not affecting the seven helices bundle structural features20. One can reasonably expect the same for the M2 receptor, because of the structural similarity of the two receptors.

Here, we consider both the binary (receptor/iperoxo) complex with the agonist iperoxo and its the ternary complex, where also the allosteric ligand LY211960 is present.

We first used the PCCA+21 approach to identify clusters of the residues related to metastable states. This requires to set the number of clusters (n), using a ‘goodness’ criterion: A specific parameter, θ, which depends on the number of clusters n, should be as close to zero (optimal value) as possible. It turns out that setting n = 2 leads to θ = 0, while is much smaller 0 for n > 2 (See methods). Therefore, setting n = 2, protein residues are distributed among two clusters and each of them can be assigned to one cluster calculating the “grades of membership” for each residue. This ranges from 0, when the residue does not belong to the specific cluster, to 1, when the residue belongs only to the cluster (Fig. 2). Here, we establish a cutoff beyond which the residue is assigned to a specific cluster. If the cutoff is set to 0.7, the two resulting clusters turn out to include residues crucial for the binding of the cognate G-proteins12 as well as the allosteric and orthosteric ligands15,22 (see Fig. 3). These clusters have been identified both in the binary and the tertiary complex. Choosing a higher cutoff such as 0.8, 0.9 turns out not to significantly affect our results (data not shown).

Figure 2
figure 2

Clusters Grade of membership. Grade of membership for the two identified residue clusters, distinguished by the color of the lines.

Figure 3
figure 3

The two identified clusters. The orthosteric ligand depicted in yellow for showing the location of the two binding sites with respect to the clusters. This figure has been created with VMD 1.9.3 software package39. (http://www.ks.uiuc.edu/Research/vmd/).

Those residues playing a role in allosteric or orthosteric binding might be instrumental for drug development, therefore, their relevance to Evib exchange is here investigated. According to PCCA+, the two recognized clusters represents regions in which the Evib exchange processes among residues within the cluster is faster than the Evib exchange between the clusters themselves. This finding agrees with previously reported computational results showing that residues belonging to the regions identified by the clusters feature shorter characteristic time for Evib exchange19.

The energy current F between the two clusters, calculated using the so-called Transition Path Theory (TPT) (see Methods), increases on passing from the binary (Fag hereafter) to the ternary complex (Fall). The relative percentage change (ΔF) between the two complexes, which reads:

$$\Delta F=\frac{{F}_{all}-{F}_{ag}}{{F}_{ag}}\cdot 100$$
(3)

is around 20%. Thus, in our specific complex, the presence of the allosteric ligand causes a speeding up of the energy exchange between distant parts of the protein. This might play a biologically relevant role, as it affects the communication among two distant biologically active sites. On the other hand, it might make the entire structure more “stable”, since any thermal perturbation can be accommodated faster among the degrees of freedom. This allows the protein to faster reach the thermal equilibrium. It should be noticed that this finding strictly holds for the case considered, although It is plausible (albeit not proved) that other GPCRs/allosteric ligands sharing a high degree of structural similarity with the one considered here might exhibit similar behavior binding an allosteric modulator belonging to the same class as LY211960. Conversely, what makes this study specific for the M2 receptor is the analysis of the residues involved in allosteric and orthosteric binding. This is carried out by calculating ΔFX for a specific residue X, as opposed to the overall ΔF value in Eq. (3). The question we will address is: Does a large value of ΔFX indicate an important role of the residue for ligands binding? Fig. 4 shows that the largest ΔFX values (>70%) are those for X = Y80, W422, Y177 and Y403. Interestingly, all of these residues are important for binding (see Fig. 5) (i) W422 and Y177 play a crucial role for the binding itself22. They form two pi-stacking bonds on two opposite sides of the ligand aromatic rings. Binding does not occur if they are mutated with a non-aromatic residue22. (ii) Y403, which interacts with W422 through a T-shaped stacking, is part of tyrosine lid that has been shown to be essential for the orthosteric binding23,24. (iii)Y80’s hydroxyl group forms a stable hydrogen bond with the allosteric ligand15, pointing to his role for binding. The latter is further confirmed by the change in magnitude of positive cooperativityα) upon mutation of the residue to alanine in the structurally similar M4 receptor22. aFootnote 1α is the ratio between the dissociation constant of the orthosteric ligand with and without the allosteric modulator: it can be considered as the relative difference in the Evib current and a measure of the relative loss/gain of stability upon binding of the allosteric modulator. Δα value is the largest so far among all those measured (Fig. SI1). The relevance these residues have, both for the Evib exchange and for the allosteric/orthosteric binding, makes them useful for being exploited for drug development. Noteworthy the interactions they make with the allosteric ligands modifies their dynamics and, thus, their property of exchanging Evib with other residues. Even a slightly change in those interactions might lead, consequently, to either a decrease or a negligible value of ΔF producing a different allosteric behavior.

Figure 4
figure 4

Relevant Residues for the allosteric Modulation. Side (a) and top (b) views of the receptor showing the residues considered in our analysis and the allosteric and orthosteric ligand. They are mostly aromatic residues either binding directly the allosteric modulator (Y80, Y83, I85, W89, W99, Y177, N410, W422) or the orthosteric (D103, Y104.Y403, Y426, Y430).

Figure 5
figure 5

Contribution to energy current. Energy current of considered residues difference between the allosteric and orthosteric ligand bound complexes. These figures have been created with VMD 1.9.3 software package39. (http://www.ks.uiuc.edu/Research/vmd/).

Conclusion

We have presented a molecular simulation study on the M2 receptor in complex with its agonist and the PAM LY211960. Our calculations show that the latter affects the dynamics and consequently the Evib exchange in the M2 receptor structure. Upon LY211960 binding, the energy between distant protein regions is speeded up. Thus, any kind of perturbation affecting the structure could be accommodated easier than in the case the ligand is absent. Residues involved in either allosteric or orthosteric binding, showed to affect the energy current change on passing from the binary to the ternary complex. The last finding may provide valuable information for further allosteric drug development. Indeed, these residues can be exploited to model new ligands targeting specifically them and effectively modifying the allosteric communication. These results might be generalized to other proteins accommodating PAMs.

Methods

Systems and calculations

The calculations are based on the mAchR iperoxo binary (PDBID 4MQS) and mAchR iperoxo LY211960 (PDBID: 4MQT) ternary complex X-ray structures. A long intracellular loop between residues 212 to 383, not present in the crystal structures, was added using the Robetta25 and MODELLER26 codes.

The proteins were inserted in a pre-equilibrated membrane, built using CHARMM-GUI membrane builder web server27. 80.000 water molecules were added. Na+ and Cl ions were also added so as to make the system neutral and to reach a physiological concentration (about 0.15 M). The overall systems consisted of about 120.000 atoms.

The AMBER99SB28 force field and TIP3P model29, Slipids30 were used for the protein, Na+ and Cl ions, water and lipids. The General Amber force field (GAFF) parameters31 were used for the ligands in both complexes, along with the RESP atomic charges32. These were obtained by fitting with the electrostatic potential (ESP) from Gaussian 0933 calculation with the HF-6-31G* basis set. The ligands topologies were converted to the GROMACS format using the ACPYPE tool34.

One simulation for each system was performed with the GROMACS program suite35. 500 ns were performed in the NPT ensemble using 1 fs time-step, sampling each 100 ps, after a 50 ns of thermalization run, in a 11 × 11 × 20 A simulation box. Nose-Hoover thermostat36 was used with τ = 0.4 ps along with Parrinello-Rahman barostat37 with τ = 1.0 ps. Electrostatic interactions were treated using PME method38. A cut-off of 1.2 nm was used for all the long-range interactions.

Robust perron cluster ananlysis

PCCA+21 aims to identify metastable states associated clusters to find a diagonal-block structure in the N by N transition frequency matrix L. To do so, an arbitrary number q of excepted clusters is defined. The “goodness” of this choice can be tested a posteriori using an estimator, θ.

Ck⊂{1, … N} is the set of indices (of the matrix L) of the k-th cluster, k = 1, … q.

Furthermore, the set of representative indices π(k) of the i-th cluster is defined so that a row tπ(k) of L is orthogonal to the characteristic vector of the cluster l (\({{\boldsymbol{\chi }}}_{l}\)):

$${{\boldsymbol{t}}}_{\pi (k)}\cdot {{\boldsymbol{\chi }}}_{{C}_{l}}={\delta }_{kl}$$
(4)

where δkl is the Kronecker delta. \({{\boldsymbol{\chi }}}_{{C}_{l}}\) is zero if the index of the entries does not belong to Cl. For a generic row tm the Eq. (4) becomes:

$${{\boldsymbol{t}}}_{m}\cdot {{\boldsymbol{\chi }}}_{{C}_{k}}={p}_{m,{C}_{k}}$$
(5)

where \({{p}}_{{m}{,}{{C}}_{{i}}}\) is the transition probability from the state m to the cluster Ci. Hence, combining Eqs. (4) and (5) we get:

$$({{\boldsymbol{t}}}_{m}-\mathop{\sum }\limits_{s=1}^{q}{p}_{m,{C}_{s}}{{\boldsymbol{t}}}_{\pi (s)})\cdot {{\boldsymbol{\chi }}}_{{C}_{k}}=0$$
(6)

This means the error one commits writing tm as a linear combination tπ(s) is orthogonal to every \({{\boldsymbol{\chi }}}_{{C}_{k}}\). In principle we don’t know \({{\boldsymbol{\chi }}}_{{C}_{k}}\) but we know that, if there is a hidden block-diagonal structure, the L’s eigenvector e(k) are almost constant on Ck. This means that we can approximate the Eq. (6) as:

$$({{\boldsymbol{t}}}_{m}-\mathop{\sum }\limits_{s=1}^{q}{w}_{m}^{(s)}\,{{\boldsymbol{t}}}_{\pi (s)})\cdot {{\boldsymbol{e}}}^{(k)}=0$$
(7)

where we have introduced \({w}_{m}^{(s)} \sim {p}_{m,{C}_{s}}\), which are defined as the grade of membership of the state m to the cluster s and they should range from 0 to 1. wm,s is the main outcome of PCCA+. In order to calculate them we can recast Eq. (7) as:

$${{\boldsymbol{e}}}_{m}^{(k)}=\mathop{\sum }\limits_{s=1}^{q}{w}_{m}^{(s)}{{\boldsymbol{e}}}_{\pi (s)}^{(k)}$$
(8)

Knowing π(s) and the eigenvectors e(k) the Eq. (8) can be inverted for finding \({w}_{m}^{(s)}\). e(k) are obtained by diagonalizing L and π(s) are chosen selecting those which maximize the distance ||erep|| where er(p) are vectors in \({{\mathbb{R}}}^{q}\) so that er(p) = (\({{\boldsymbol{e}}}_{r(p)}^{(1)},\ldots ,{{{\boldsymbol{e}}}_{r(p)}}^{(q)}\))21.

Finally, we have to discuss the chosen number of clusters. Ad stated above \({w}_{m}^{(s)} \sim {p}_{m,{C}_{s}}\), hence, \({w}_{m}^{(s)}\ge 0\). There is no guarantee the latter is fulfilled, therefore, the positiveness of \({w}_{m}^{(s)}\) could be used to evaluate if the number of chosen clusters q is suitable for the system under consideration or not. Thus, we define θ as:

$$\theta =mi{n}_{k,m}{w}_{m}^{(k)}$$
(9)

whereas \(\theta \ll 0\) the number of clusters is too large.

Transition path theory

Given two sets of residues A and B (Identified by PCCA+), we define a trajectory on the Markov chain as a sequence of residues over time. The probability that a trajectory starting from a residue i ∈(AB)c reaches first A then B is called forward committor probability \({q}_{i}^{+}\) and it solves the following set of equations:

$$\begin{array}{cc}\sum _{j}{{\boldsymbol{L}}}_{ij}{q}_{j}^{+}=0 & \forall \,i\in {(A\cup B)}^{c}\\ {q}_{i}^{+}=0 & \forall \,i\in A\\ {q}_{i}^{+}=1 & \forall \,{\rm{i}}\in B\end{array}$$
(10)

Conversely, we can define the backward committor probability \({q}_{i}^{-}\), which is the probability of reaching i first than B starting from A and, as in the case of \({q}_{i}^{+}\), the following equations hold :

$$\begin{array}{cc}\sum _{j}{{\boldsymbol{L}}}_{ij}{q}_{j}^{-}=0 & \forall \,i\in {(A\cup B)}^{c}\\ {q}_{i}^{-}=1 & \forall \,i\in A\\ {q}_{i}^{+}=0 & \forall \,i\in B\end{array}$$
(11)

Trajectories going from A to B are called reactive trajectories. The probability to find reactive trajectories, given i ∈ (AB)c is:

$${m}_{i}^{R}={p}_{i}^{\infty }{q}_{i}^{-}{q}_{i}^{+}$$
(12)

where \({p}_{\alpha }^{\infty }\) is the stationary probability distribution. Finally, we can define the probability current of reactive trajectories “flowing” from i to j, with i, j ∈ (AB)c and ij as:

$${f}_{ij}^{AB}={p}_{i}^{\infty }{q}_{j}^{-}{{\boldsymbol{L}}}_{ij}{q}_{j}^{+}$$
(13)

In conclusion, the probability current exiting from a cluster A and entering in B is :

$${F}^{AB}=\sum _{\alpha \in A}\sum _{\beta \notin A}{f}_{ij}^{AB}$$
(14)

This current is directly connected to the Evib current by a multiplicative constant. However, in this work we are interested in relative changes in the flux and not in their absolute values, thus, our results hold independently this constant.