Journal of Biomolecular NMR, Volume 52, Issue 1, pp 23–30

An NMR-based scoring function improves the accuracy of binding pose predictions by docking by two orders of magnitude

  • Julien Orts
  • Stefan Bartoschek
  • Christian Griesinger
  • Peter Monecke
  • Teresa Carlomagno

Abstract

Low-affinity ligands can be efficiently optimized into high-affinity drug leads by structure based drug design when atomic-resolution structural information on the protein/ligand complexes is available. In this work we show that the use of a few, easily obtainable, experimental restraints improves the accuracy of the docking experiments by two orders of magnitude. The experimental data are measured in nuclear magnetic resonance spectra and consist of protein-mediated NOEs between two competitively binding ligands. The methodology can be widely applied as the data are readily obtained for low-affinity ligands in the presence of non-labelled receptor at low concentration. The experimental inter-ligand NOEs are efficiently used to filter and rank complex model structures that have been pre-selected by docking protocols. This approach dramatically reduces the degeneracy and inaccuracy of the chosen model in docking experiments, is robust with respect to inaccuracy of the structural model used to represent the free receptor and is suitable for high-throughput docking campaigns.

Keywords

NMR, INPHARMA, NOE, Docking, Drug design

Introduction

Structure based drug design (SBDD) has evolved over the last decades into a powerful tool for the optimization of many low molecular weight lead compounds into highly potent drugs (Rees et al. 2004). The principle of SBDD lies in the combination of different chemical moieties with the aim of obtaining a molecule that, while possessing the pharmacological properties necessary for a drug, is complementary in shape to the receptor-binding pocket. This process requires knowledge of the exact structure of the receptor/ligand complex, which is usually obtained by X-ray crystallography.

In the absence of structural information for the complex, SBDD relies on the generation of plausible docking models. However, docking protocols suffer from inaccuracies in the description of the interaction energies between the ligand and the target molecule and often fail to predict the correct interaction mode. This is particularly true when the docking experiments use low-definition or inaccurate target structures. This limitation of the docking approach is serious when considering the increasing gap between the newly identified protein sequences and the availability of structural information (The UniProt Consortium 2008; Berman et al. 2009). For proteins sharing more than 30% sequence identity with their homologous templates, computational methods provide models that are typically comparable to low-resolution experimental structures; when the sequence identity drops below 30%, however, the model accuracy decreases due to alignment errors (Kortagere and Ekins 2010; Katritch et al. 2010; Rai et al. 2010). These problems call for experimental data that could improve the performance of the docking scoring functions, while not requiring the difficult step of obtaining high-resolution structural information for the target.

In recent years, nuclear magnetic resonance (NMR) spectroscopy has taken an important role in the detection and structural characterization of low-affinity (micromolar to millimolar range) ligands that can be developed into high-affinity leads by SBDD (Pochapsky and Pochapsky 2001; Wyss et al. 2002; Van Dongen et al. 2002; Pellecchia et al. 2002). Transferred-NOEs (Ni and Scheraga 1994) and transferred-cross correlated relaxation rates (Carlomagno et al. 1999, 2003) provide the bioactive conformation of the ligand, while saturation transfer difference (STD) experiments (Mayer and Meyer 2001) reveal the ligand epitope. These approaches have the advantage of observing only the resonances of the ligands and of being applicable to protein targets of any size. Recently, we have developed an NMR-based methodology, INPHARMA (Interligand NOEs for PHARmacophore MApping) (Sanchez-Pedregal et al. 2005; Reese et al. 2007; Orts et al. 2008), which is able to reveal the relative, and in favourable cases even the absolute, binding mode of competitively binding, low-affinity ligands, with the sole requirement of a structural model of the apo-receptor. The relative binding mode of two ligands interacting competitively with a common receptor allows pharmacophore or ligand superimposition. This is an essential step in SBDD, guiding the synthetic combination of smaller ligands into a larger, higher affinity compound. The absolute binding mode of ligands to the receptor represents a higher level of knowledge that allows optimization of receptor/ligand interactions at an atomic level.

To demonstrate the efficacy of INPHARMA we previously validated the methodology for a system consisting of the protein kinase A (PKA) with two inhibitors of catalysis (Fig. S1), for which the binding modes determined by INPHARMA could be compared to existing crystal structures (Orts et al. 2008). This test established the value of INPHARMA, and confirmed that the combination of tr-NOEs and INPHARMA NOEs, in the presence of a structural model for the apo-protein, allows discrimination between a few, very diverse docking modes (Orts et al. 2008).

Here, we demonstrate the use of INPHARMA data as a high-throughput scoring function for binding modes predicted by molecular docking. We show that INPHARMA allows an increase in accuracy of two orders of magnitude with respect to state-of-the-art docking scoring functions and provides ligand binding modes at high resolution (up to less than 1 Å). In addition, we show that INPHARMA is also applicable to receptors whose apo-form is not a good representative of the holo-form or to receptors without an accurate structural representation. The easy availability of experimental INPHARMA data for target proteins of any size and nature makes INPHARMA a tool of choice to increase the reliability of docking models and to substantially speed up the process of structure-based drug design.

Experimental procedures

INPHARMA data

The INPHARMA data were measured on a mixture of the Chinese hamster Cα catalytic subunit of cyclic adenosine monophosphate (cAMP) dependent protein kinase A (PKA) (25–30 µM), L1 (150 µM) and L2 (450 µM), as described in Orts et al. (2008). The values of the measured INPHARMA NOEs used in this work are in Table S1. NOESY spectra were recorded at two mixing times (τm = 300 and 600 ms) on an 800 MHz spectrometer (Bruker, Karlsruhe). One NOESY spectrum was recorded at a mixing time τm = 600 ms on a 900 MHz spectrometer (Bruker, Karlsruhe).

Molecular dynamics simulations

Molecular dynamics (MD) simulations were performed for the free PKA starting from the crystal structure of 3DNE.pdb after removal of L1 with the software NAMD (Phillips et al. 2005) and the CHARMM force field (MacKerell et al. 1988). The protein was hydrated with a 5 Å layer of water molecules, and the water sphere was maintained with a spherical harmonic potential. Langevin dynamics was performed with a 2 fs time step using the SHAKE algorithm, without coupling the hydrogens to the thermal bath and with a damping coefficient g of 5 per picosecond. First, 30,000 steps of energy minimization were performed at 0 K using a conjugate gradient and the line-search algorithm as described in the NAMD manual. In order to achieve a larger sampling of the conformational space of the protein, we increased the temperature from an initial value of 0 K to a final value of 1,200 K: every 30,000 steps the temperature was increased by 50 K. Each final structure was minimized at a temperature of 0 K in 30,000 steps.
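The heating protocol can be written out as a simple schedule. The following is a minimal sketch, assuming that the first 30,000-step stage runs at 50 K and the last at 1,200 K; the function name and the treatment of the stage boundaries are illustrative assumptions and are not taken from the NAMD input used in this work.

```python
def heating_schedule(t_final=1200, t_step=50, steps_per_stage=30_000):
    """Return a list of (first_step, temperature_in_K), one entry per heating stage.

    Assumption: the temperature is raised by t_step at the start of each stage,
    so the stages run at 50 K, 100 K, ..., t_final.
    """
    temperatures = range(t_step, t_final + t_step, t_step)
    return [(i * steps_per_stage, t) for i, t in enumerate(temperatures)]


schedule = heating_schedule()
total_steps = len(schedule) * 30_000
total_time_ns = total_steps * 2e-6  # 2 fs per step, expressed in ns

print(len(schedule), total_steps, total_time_ns)
```

Under these assumptions the ramp consists of 24 stages, that is 720,000 steps or about 1.44 ns of simulated time at a 2 fs time step.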

Docking

PLANTS docking (Korb et al. 2009) was performed using the ChemPLP scoring function and default parameters. The crystal structures of PKA/L1 and PKA/L2 were aligned on the protein. A spherical definition of the binding pocket, centered at the center of mass of L1 and with a radius of 11.2 Å, was used to restrict the sampling space. After pose clustering using a threshold of 0.5 Å, the best-ranked 200 poses were used for further evaluations.

SURFLEX utilizes an idealized binding-site ligand as a target to generate conjectural poses of molecules. The idealized binding-site ligand is calculated for each of the 700 protein structures generated by MD simulation and minimized. Ligands are docked into the protein to optimize the value of the Hammerhead scoring function. For the analysis, we select the 10 best scoring poses for each protein structure. A similarity filter of 0.5 Å is applied for poses docked into the same protein structure, resulting in a final set of 4,636 and 4,758 unique complex structures for PKA/L1 and PKA/L2, respectively.

GLIDE requires preparing each protein target with the “preparation wizard” option. For each prepared protein, we generated a grid around the binding site that was used as the docking target. A simple precision docking run was performed for each of the 700 protein structures generated by MD simulation, producing 5 poses per protein structure per ligand (PKA/L1 and PKA/L2). As for the SURFLEX docking, a similarity filter of 0.5 Å (integrated in GLIDE) deleted redundant ligand poses within the same protein structure. This procedure resulted in a final set of 2,697 and 3,069 unique complex structures for PKA/L1 and PKA/L2, respectively.

Calculation of INPHARMA

The theoretical INPHARMA NOEs for the complex pairs generated by docking were calculated with a program written in-house following the theory developed in (Reese et al. 2007; Orts et al. 2009). Protons within 8 Å from any ligand proton were included in the full relaxation matrix calculation. For the docking of Fig. 1, 40,000 complex pairs were calculated; for the docking with SURFLEX in Fig. 2 the dataset comprised 1,095,085, 2,694,315, 1,159,884, 70,448, 234,060 complex pairs for proteins with binding pocket RMSD in the range 0–1, 1–2, 2–3, 3–4, 4–5 Å from the crystal structure, respectively, of which 8,331, 16,770, 3,296, 181 and 412 correspond to the correct relative orientation of the ligands in each protein RMSD range.
Fig. 1

a Initial pool of docked structures for the complexes PKA/L1 and PKA/L2. The receptor model used in the docking is the PKA structure of 3DNE.pdb. b Complex pairs that pass the selection through the INPHARMA data (Pearson correlation coefficient R2 between the experimental and the theoretical INPHARMA NOEs > 0.89). All these complex pairs show a very low ligand RMSD from the correct binding mode. c Overlap of L1 and L2 in the complex pairs of b, after superimposition of the protein structures. The INPHARMA data define the orientation of L1 and L2 correctly to 0.5 and 1 Å resolution, respectively

Fig. 2

Accuracy of the INPHARMA predictions as a function of the quality of the receptor structure. The x-axis represents the (protein only) binding pocket RMSD of the receptor models used in the docking from the crystallographic structure of PKA in the complex PKA/L1 (3DNE.pdb). The accuracy on the y axis is defined as the number of complex pairs reproducing the correct ligands superposition (relative binding mode of L1 and L2) divided by the total number of pairs selected by INPHARMA. The numbers over each bar in red represent the accuracy before applying the INPHARMA score. In this case the accuracy is the number of the complex pairs showing the correct ligands superposition divided by the total number of complex pairs selected by the energy function of the docking program. The docking for this dataset was performed with SURFLEX

The ranking of the complex pairs was based on the centered Pearson correlation coefficient between the measured and the predicted INPHARMA NOEs. Structures were accepted when the Pearson correlation coefficient R2 was higher than 0.89 for the data of Fig. 1 and 0.72 for the data of Fig. 2 and Fig. S4. An additional filter was applied based on the qualitative agreement of very weak INPHARMA NOEs, which were visible only at high field due to the better sensitivity of the instrumentation. For the docking of Fig. 2, 107, 208, 189, 26 and 23 complex pairs with proteins in an RMSD range of 0–1, 1–2, 2–3, 3–4, 4–5 Å from the crystal structure, respectively, passed the selection, of which 98, 63, 60, 5 and 7 correspond to the correct relative orientation of the ligands in each protein RMSD range.
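The acceptance criterion can be illustrated with a short sketch. It assumes that R2 denotes the squared, centered Pearson coefficient; the function names and the example NOE values are hypothetical, and the code is not the in-house program used in this work.

```python
def pearson_r2(experimental, predicted):
    """Squared centered Pearson correlation between two equal-length sequences."""
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_e == 0 or var_p == 0:
        return 0.0  # a constant data set carries no correlation information
    return cov * cov / (var_e * var_p)


def passes_filter(experimental, predicted, threshold=0.89):
    """Accept a complex pair when R2 exceeds the threshold (0.89 or 0.72 in the text)."""
    return pearson_r2(experimental, predicted) > threshold


# Hypothetical NOE intensities, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
good_model = [2.1, 3.9, 6.2, 8.0]
poor_model = [1.0, 3.0, 2.0, 1.0]

print(passes_filter(measured, good_model), passes_filter(measured, poor_model))
```

A squared coefficient discards the sign of the correlation; whether the original program also checks the sign is not stated in the text.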

For the docking with GLIDE in Fig. S4, the dataset comprised 452,760, 908,974, 575,320, 36,864, 73,060 complex pairs for proteins with binding pocket RMSD in the range 0–1, 1–2, 2–3, 3–4, 4–5 Å from the crystal structure, respectively, of which 13,560, 8,694, 4,184, 160 and 166 correspond to the correct relative orientation of the ligands in each protein RMSD range. Moreover, 2, 485, 585, 35, 39 complex pairs with proteins in an RMSD range of 0–1, 1–2, 2–3, 3–4, 4–5 Å from the crystal structure, respectively, passed the selection, of which 2, 114, 103, 19 and 9 correspond to the correct relative orientation of the ligands in each protein RMSD range.

Results and discussion

The INPHARMA method is based on the observation of interligand, spin diffusion mediated, transferred-NOE data, between two ligands L1 and L2, binding competitively and weakly to a receptor T (Fig. S2). As the ligands are competitive binders, such NOEs do not originate from a direct transfer of magnetization between the two ligands, but rather from a spin-diffusion process mediated by the protons of the receptor binding pocket and are, therefore, dependent on the specific interactions of each of the two ligands with the protein (Sanchez-Pedregal et al. 2005). In line with common SBDD workflows, the INPHARMA NOEs are used to select among possible complex structures suggested by molecular docking. The bound ligand structures, which can be determined by tr-NOEs, are docked to a structural model of the apo-receptor. A library consisting of pairs of complex structures (receptor/L1 and receptor/L2) is generated by combining all docking modes of L1 to the receptor with all docking modes of L2 to the receptor. The resulting pairs of docking models are ranked on the basis of the agreement between the predicted and the experimental INPHARMA NOEs (Reese et al. 2007).
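The construction and ranking of the pair library amounts to a Cartesian product followed by a sort. The sketch below uses placeholder strings for the docking modes and a caller-supplied scoring function; all names are hypothetical.

```python
from itertools import product


def build_pair_library(poses_l1, poses_l2):
    """Combine every docking mode of L1 with every docking mode of L2."""
    return list(product(poses_l1, poses_l2))


def rank_pairs(pairs, score):
    """Sort pairs from best to worst agreement; `score` returns e.g. a correlation."""
    return sorted(pairs, key=score, reverse=True)


# Placeholder identifiers standing in for docked complex structures.
poses_l1 = [f"L1_pose_{i}" for i in range(200)]
poses_l2 = [f"L2_pose_{j}" for j in range(200)]
pairs = build_pair_library(poses_l1, poses_l2)

print(len(pairs))
```

With 200 docking modes per ligand this gives the 40,000 complex pairs evaluated for Fig. 1.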

Previously, we demonstrated that INPHARMA is able to determine the binding mode of the two ligands L1 and L2 to the catalytic subunit of PKA (Orts et al. 2008). In this work we aim at establishing INPHARMA as an effective scoring function for binding modes in high-throughput docking campaigns. First, we evaluate the ability of INPHARMA to provide high-resolution binding modes when ligands are docked to a correctly folded binding pocket; second, we evaluate the efficacy of the methodology as a function of the accuracy of the protein structure used in the docking experiments. We prove that the use of experimental INPHARMA data to score binding modes generated in silico provides a considerable improvement in the accuracy of the selection of the correct binding pose, even when using a poor representation of the protein binding pocket. As a test system, we use the two ligands L1 and L2 bound to the catalytic subunit of the protein PKA, for which experimental data have been measured in the laboratory as described in the Experimental Section. L1 and L2 bind PKA with KDs of 6 and 16 µM, respectively, and are therefore suitable to measure both transferred-NOEs and INPHARMA NOEs. The crystal structures of the complexes PKA/L1 and PKA/L2 (3DNE.pdb and 3DND.pdb, respectively) serve as benchmarks to evaluate the performance of INPHARMA.

INPHARMA allows the definition of binding modes to 1 Å resolution

The bound structures of L1 and L2, which can be determined by transferred-NOEs, are docked into the structure of the catalytic subunit of PKA from 3DNE.pdb after removal of the ligand. The PKA structure of 3DND.pdb could have been used instead, as the protein heavy atom RMSD (root mean square deviation) in the two complexes is only 0.28 Å. For each ligand, 200 docking modes are generated with the program PLANTS (Korb et al. 2009) and combined pair-wise to give 40,000 pairs of complex structures of PKA/L1 and PKA/L2. Each pair of this library is represented in Fig. 1 in terms of the RMSD of each ligand from the true binding mode, as observed in the crystal structures of the PKA/L1 and PKA/L2 complexes (3DNE.pdb and 3DND.pdb). The initial library of docking modes contains pairs of complex structures where both ligands are in the correct orientation (lower left corner), both ligands are in the wrong orientation (upper right corner) or only one ligand is in the correct orientation (lower right and upper left corners). Next we ranked the 40,000 structure pairs with respect to the agreement between the theoretical, predicted INPHARMA NOEs for each particular structure pair and the experimentally measured INPHARMA NOEs of Table S1. The purpose of this analysis is to verify whether INPHARMA data can be used to select the correct binding modes of L1 and L2 and to determine the maximum achievable resolution of the resulting complex structures. We use the linear correlation coefficient R2 to describe the agreement between experimental and theoretical INPHARMA data; pairs of complex structures with R2 > 0.89 are accepted. Indeed, the structures selected by INPHARMA (Fig. 1b) are those of the lower left corner of the graph of Fig. 1a, namely close to the correct binding poses for both ligands.
A closer analysis of the INPHARMA-selected structures reveals that they correspond to only one orientation per ligand, with L1 and L2 being defined to a precision better than 0.5 and 1 Å, respectively (Fig. 1c). The maximum distance between two INPHARMA-selected structures is between the orange and the yellow binding mode of L2 (Fig. 1c) and corresponds to a rotation of 21° around the axis perpendicular to the figure plane. This result highlights an impressive performance of INPHARMA, which distinguishes even between closely related binding modes at a high level of resolution (~1 Å). The receptor model used in the docking can be derived either from the structure of the apo-receptor or from the structure of the receptor in complex with a reference ligand Lx. In the absence of conformational rearrangements between the apo- and the holo-receptor, or between the receptor/Lx and the receptor/L1 (receptor/L2) complexes, the absolute binding mode of any ligand (L1…Ln) can be derived at a high confidence level from INPHARMA data measured for pair-wise combinations of ligands (e.g. L1 and L2).

INPHARMA alleviates the need to crystallize the receptor in complex with all chemical lead series of interest, overcoming an important limiting factor in the daily work of the pharmaceutical industry. The binding modes of all chemical series of interest are within reach through the measurement of a few INPHARMA NOESY spectra and the employment of the INPHARMA NOEs as a reliable selection criterion for docking modes. The NMR time necessary to acquire data for a ligand pair amounts to only 2 days, while the calculation time is less than 1 day for 40,000 pairs of docking models.

INPHARMA allows a 100-fold improvement with respect to docking scoring functions

Despite the enormous potential of INPHARMA demonstrated in the previous section, pharmaceutical research often faces more challenging cases, where either the structure of the receptor is not known at a high level of accuracy or the receptor undergoes substantial conformational changes between the apo- and holo-forms (Bartoschek et al. 2010). In this section the performance of INPHARMA as an energy function to rank docking modes generated from an ill-defined protein structure is systematically tested and compared with the performance of state-of-the-art docking scoring functions.

As a test system we use the protein PKA in complex with L1 and L2 (Fig. S1). Structures of PKA that differ from the ligand-bound structure were generated by a high-temperature molecular dynamics simulation run starting from the crystal structure of 3DNE.pdb after removal of L1. 700 frames were sampled during the simulation, resulting in structures that display 0.5–6 Å heavy atom RMSD in the binding pocket from the ligand-bound structure (Fig. S3). In our definition the binding pocket comprises all atoms with distance <8 Å from any ligand atom in the crystal structure. All frames were subjected to energy minimization in explicit water. This initial library of PKA models contains structures in a wide range of distances from the correct one and is therefore optimally suited to evaluate the performance of INPHARMA as a function of the accuracy of the protein structural model.
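The binding pocket definition used here (all atoms closer than 8 Å to any ligand atom) can be expressed in a few lines. This is a minimal sketch on plain coordinate tuples; the function name and the toy coordinates are hypothetical, and reading coordinates from a PDB file is left out.

```python
from math import dist


def pocket_atoms(protein_coords, ligand_coords, cutoff=8.0):
    """Indices of protein atoms closer than `cutoff` (in Å) to any ligand atom."""
    return [
        i
        for i, atom in enumerate(protein_coords)
        if any(dist(atom, lig) < cutoff for lig in ligand_coords)
    ]


# Toy coordinates in Å, for illustration only.
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0)]

print(pocket_atoms(protein, ligand))  # [0, 1]
```

The binding pocket RMSD of an MD frame would then be computed over the selected atoms only, after superimposing the frame on the crystal structure.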

Next we docked the protein-bound conformation of L1 and L2 to each of the 700 structural models of the protein PKA. We used the rigid docking module of the commercially available software SURFLEX (Jain 2003) and retained the 10 best energy solutions for each docking run. A filter based on similarity was applied to exclude redundant binding modes for the same protein model. Complex structures with the same protein model and with ligand all-atom RMSD < 0.5 Å were represented by one member of the family. Note that similar ligand binding poses in two different protein models are considered non-redundant and are retained. The final set of complexes consists of 4,636 and 4,758 poses for PKA/L1 and PKA/L2, respectively.
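The redundancy filter can be sketched as a greedy selection of representatives within each protein model. The greedy scheme is an assumption made for illustration; the clustering algorithms implemented in SURFLEX and GLIDE are not described in the text. Poses are lists of atom coordinates with identical atom ordering, and all names are hypothetical.

```python
from math import sqrt


def rmsd(pose_a, pose_b):
    """All-atom RMSD in Å between two poses in the same frame (no refitting)."""
    squared = sum((x - y) ** 2 for p, q in zip(pose_a, pose_b) for x, y in zip(p, q))
    return sqrt(squared / len(pose_a))


def filter_redundant(poses_by_model, threshold=0.5):
    """Keep one representative per family of similar poses within each protein model.

    `poses_by_model` maps a protein-model id to its poses, best score first.
    Poses from different protein models are never compared with each other.
    """
    kept = {}
    for model, poses in poses_by_model.items():
        representatives = []
        for pose in poses:
            if all(rmsd(pose, rep) >= threshold for rep in representatives):
                representatives.append(pose)
        kept[model] = representatives
    return kept


pose_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_b = [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]  # 0.1 Å from pose_a: redundant
pose_c = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]  # 3 Å from pose_a: distinct
kept = filter_redundant({"model_1": [pose_a, pose_b, pose_c], "model_2": [pose_a]})

print({model: len(poses) for model, poses in kept.items()})
```

Because the comparison is restricted to poses within one protein model, the same ligand pose docked into two different models is retained twice, as stated above.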

The 4,636 and 4,758 structures for the PKA/L1 and PKA/L2 complexes, respectively, have been selected by the docking scoring function as the lowest energy ones and represent the docking solution to the problem. At this point it is interesting to evaluate which percentage of the docking models predicts the correct relative orientation of the two ligands, in dependence of the accuracy of the receptor model used in the docking. This is highly relevant to SBDD as a correct ligands superposition is the first necessary step to the structure-guided, synthetic combination of lead compounds.

To this purpose, ligand binding modes of L1 and L2 were combined pair-wise for all protein models inside a certain range of RMSD (0–1; 1–2; 2–3; 3–4; 4–5 Å) from the correct protein structure (3DNE.pdb). This resulted in 1,095,085, 2,694,315, 1,159,884, 70,448 and 234,060 complex pairs in each of the five RMSD ranges, respectively. To evaluate the similarity of the relative orientation of the ligands in each pair of models to the correct relative orientation, as observed in the crystal structures of PKA/L1 and PKA/L2, we used quaternions, which describe object rotations with respect to a reference frame. The exact procedure is explained in the Supporting Information. The relative binding mode of L1 and L2 to PKA was considered correct when the quaternions defining the rotations of L1 and L2 in the docking models with respect to the crystallographic structures of PKA/L1 and PKA/L2 satisfied the conditions of Eq. S1 and S2, namely the two quaternions were sufficiently similar. The accuracy was defined as the ratio of the number of correct pairs of complex structures (in terms of relative ligand orientation) to the total number of structural pairs.
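One common way to quantify how similar two rotation quaternions are is the angle of the rotation that takes one orientation into the other. The sketch below illustrates this measure only; it does not reproduce Eq. S1 and S2, which are given in the Supporting Information, and the cutoff argument of `similar` is a hypothetical parameter that the caller must choose.

```python
from math import acos, cos, degrees, radians, sin, sqrt


def normalize(v):
    """Scale a vector or quaternion to unit length."""
    norm = sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def axis_angle_quaternion(axis, angle_deg):
    """Unit quaternion (w, x, y, z) for a rotation of angle_deg about `axis`."""
    ax = normalize(axis)
    half = radians(angle_deg) / 2.0
    return (cos(half),) + tuple(sin(half) * c for c in ax)


def rotation_angle_between(q1, q2):
    """Angle in degrees of the rotation taking orientation q1 to orientation q2."""
    q1, q2 = normalize(q1), normalize(q2)
    # q and -q describe the same rotation, hence the absolute value.
    d = abs(sum(a * b for a, b in zip(q1, q2)))
    return degrees(2.0 * acos(min(1.0, d)))


def similar(q1, q2, max_angle_deg):
    """True when the two rotations differ by less than max_angle_deg (hypothetical cutoff)."""
    return rotation_angle_between(q1, q2) < max_angle_deg


# Example: L1 rotated by 10° and L2 by 31° about the same axis.
q_l1 = axis_angle_quaternion((0.0, 0.0, 1.0), 10.0)
q_l2 = axis_angle_quaternion((0.0, 0.0, 1.0), 31.0)

print(round(rotation_angle_between(q_l1, q_l2), 3))  # 21.0
```

If both ligands are rotated identically with respect to their crystallographic poses, the angle is zero and the relative orientation of L1 and L2 is preserved even though the absolute binding modes may be wrong.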

The red numbers in Fig. 2 summarize the results. The accuracy of predicting the correct superimposition of the ligands by the docking scoring function is rather poor and reaches at best 1% for ligands docked to the correct protein structure (receptor model RMSD < 1 Å). This low number is not surprising, as in general an average accuracy of only 5% is assumed for docking calculations with one ligand (the correct binding pose is found in the 20 lowest energy structures) (Davis and Baker 2009; Davis et al. 2009). To verify that this result is not only dependent on the docking program used, we repeated the docking exercise with GLIDE (Schroedinger 2003) and obtained very similar results (Fig. S4). The poor performance of the docking reflects the difficulty of both SURFLEX and GLIDE in finding the correct binding pose for L1 (Fig. S5). In general, buried binding pockets, like the ATP binding pocket in PKA, are unfavorable cases for in silico methods (Davis et al. 2009). The limited success of two of the most popular docking programs in this case calls for an additional, experiment-based scoring function.

Following this reasoning, we evaluated the performance of the experimental INPHARMA data for ranking and filtering the pairs of docking modes. Complex structure pairs for which the correlation coefficient R2 between the predicted and the experimental INPHARMA NOEs was lower than 0.72 were rejected. The threshold for R2 is set lower than described in the previous section to account for the fact that the use of an ill-defined protein model to generate complex structures affects the quality of the fit of the theoretical to the experimental INPHARMA NOEs even for correct binding modes. The accuracy of the ligand superimposition after filtering the complex pairs with respect to the INPHARMA score is given in Fig. 2 and Fig. S4 (for docking performed with SURFLEX and GLIDE, respectively). In contrast to the docking scoring function, the additional filtering through the INPHARMA score achieves an accuracy >90% for reasonably well-defined protein models (PKA RMSD ≤ 1 Å). For less well-defined protein models (PKA RMSD > 1 Å) the accuracy drops to 30% but remains constant for the complete protein RMSD range (up to 5 Å) (Fig. 2). The weak dependence of the accuracy on the protein structure quality in a wide range of RMSD (1–5 Å) indicates that INPHARMA is a solid scoring function for docking modes calculated using a poor representation of the protein target structure, such as a homology model. The gain in accuracy provided by INPHARMA with respect to the energy function of the in silico docking reaches two orders of magnitude throughout the whole range of protein structures, underlining the enormous advantage of using these few, easily accessible experimental data for docking scoring and validation.
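The accuracies before and after the INPHARMA filter follow directly from the pair counts reported for the SURFLEX dataset in the Experimental procedures. The short calculation below reproduces them; only the counts are taken from the text, while the variable and function names are illustrative.

```python
RMSD_RANGES = ["0-1", "1-2", "2-3", "3-4", "4-5"]  # binding pocket RMSD bins in Å

# (total complex pairs, pairs with correct relative ligand orientation)
SURFLEX_DOCKING = [
    (1_095_085, 8_331),
    (2_694_315, 16_770),
    (1_159_884, 3_296),
    (70_448, 181),
    (234_060, 412),
]
SURFLEX_INPHARMA = [(107, 98), (208, 63), (189, 60), (26, 5), (23, 7)]


def accuracy(total, correct):
    """Fraction of complex pairs with the correct relative ligand orientation."""
    return correct / total


def gains(before, after):
    """Per-range accuracy before and after filtering, and their ratio."""
    rows = []
    for label, (tot_b, cor_b), (tot_a, cor_a) in zip(RMSD_RANGES, before, after):
        acc_b = accuracy(tot_b, cor_b)
        acc_a = accuracy(tot_a, cor_a)
        rows.append((label, acc_b, acc_a, acc_a / acc_b))
    return rows


rows = gains(SURFLEX_DOCKING, SURFLEX_INPHARMA)
for label, acc_b, acc_a, gain in rows:
    print(f"{label} Å: docking {acc_b:.2%}, INPHARMA {acc_a:.1%}, gain {gain:.0f}x")
```

For the 0–1 Å range this gives 98/107 ≈ 92% after filtering against 8,331/1,095,085 ≈ 0.76% before, a gain of roughly 120-fold; the other ranges give gains between roughly 50- and 170-fold.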

The criterion based on quaternions (Eq. S1 and S2) to evaluate the correctness of the relative binding mode of L1 and L2 is far more restrictive than the common requirements of SBDD workflows. In SBDD, a ligand is considered to be in the correct orientation when its RMSD from the true binding mode is less than 2 Å. A similar criterion can be applied to evaluate the correctness of the ligands superposition in the INPHARMA-selected complex pairs. If the RMSDs between both L1 and L2 in the INPHARMA-selected pairs and both L1 and L2 in the crystal structures of PKA/L1 and PKA/L2, after superimposition of the ligands, are less than 2 Å, the ligands superposition is considered to be correct. With this measure, the accuracy of the INPHARMA selection reaches 75% for all protein structures (Fig. 3).
Fig. 3

Representation of the complex pairs PKA/L1 and PKA/L2 selected by INPHARMA in dependence of the RMSD of PKA from the coordinates of the crystal structure 3DNE.pdb used as initial model for the docking (a RMSD < 1 Å; b RMSD < 2 Å; c RMSD < 3 Å; d RMSD < 5 Å). The numbers and the color code of each circle slice represent the ligands RMSD (L1, L2) from the coordinates of the ligands in the crystal structures of PKA/L1 and PKA/L2, after superimposition of the ligands. When the initial protein model is well-defined (PKA RMSD < 1 Å), INPHARMA selects complex pairs that allow ligands superposition at a resolution better than 1.5 Å. Usually, in structure based drug design, a resolution of 2 Å in the ligand coordinates is considered to be acceptable. In this limit the accuracy of the ligands superposition in the INPHARMA-selected binding modes is always higher than 75%, even for ill-defined protein models (panels c and d)

Evaluation of the accuracy of INPHARMA in determining ligands superposition when the binding mode of one of the ligands is known

Frequently in SBDD campaigns, the receptor protein can be crystallized with some but not all lead series of interest. In these cases, the process of finding the correct ligands superposition can be aided by the availability of the absolute binding mode of a reference ligand. To test the performance of INPHARMA in this scenario we repeated the analysis of Fig. 2 on complex pairs for which either the PKA/L1 (Fig. 4a) or the PKA/L2 (Fig. 4b) structure was fixed to the correct one, as seen in 3DNE.pdb or 3DND.pdb. The correctness of the orientation of the non-fixed ligand was evaluated by applying the criteria of Eq. S2 on the quaternion of this ligand. In this scenario the performance of INPHARMA is excellent: when L1 is fixed and the orientation of L2 is unknown, selection of docking modes by INPHARMA reaches an accuracy of 100% in all cases. It is worth noting that for ill-defined protein models, no complex structure passes the INPHARMA selection, indicating that in this case the INPHARMA data provide information also on the protein structure. When L2 is fixed and the orientation of L1 is searched, the performance of INPHARMA is slightly worse; nevertheless an improvement in accuracy of more than two orders of magnitude is achieved with respect to the energy function of the in silico docking, which, as discussed above, performs particularly badly in predicting the binding mode of L1. Also in this case INPHARMA finds no solution for ill-defined protein models (RMSD > 3 Å), thereby restricting also the protein conformation.
Fig. 4

Accuracy of the INPHARMA predictions as a function of the quality of the receptor structure when the binding mode of one of the two ligands is known and used as reference. The x-axis represents the (protein only) binding pocket RMSD of the receptor models used in the docking from the crystallographic structure of PKA in the complex PKA/L1 (3DNE.pdb). The accuracy on the y axis is defined as the number of complex pairs reproducing the correct ligands superposition (absolute binding mode of L2, using the PKA/L1 complex as reference, in (a); absolute binding mode of L1, using the PKA/L2 complex as reference, in (b)) divided by the total number of pairs selected by INPHARMA. The numbers over each bar in red represent the accuracy before applying the INPHARMA score. In this case the accuracy is the number of the complex pairs showing the correct ligands superposition divided by the total number of complex pairs selected by the energy function of the docking program. The docking for this dataset was performed with SURFLEX

Conclusions

The INPHARMA method allows closing a gap in structure based drug discovery by providing information at atomic resolution on the receptor/ligand interactions for complexes that cannot be crystallized. When an accurate representation of the bound structure of the receptor is used for docking, INPHARMA experimental data allow selection of the correct ligand binding pose to a resolution better than 1 Å. The success rate of INPHARMA decreases for docking models obtained with an inaccurate structure of the receptor; however, independently of the quality of the receptor structure, the performance of the INPHARMA-based ranking exceeds by 100-fold that of the scoring function of state-of-the-art docking programs. INPHARMA data are easy to measure and require no isotope labeling scheme either for the receptor or for the ligands. All these factors encourage the implementation of INPHARMA experimental data as a routine scoring function to select complex models for weakly binding ligands.

Notes

Acknowledgments

This work was supported by the EMBL and by grant I 83-545 of the Volkswagen Stiftung. J. O. thanks Benjamin Stauch and Frank Thommen for support in software installation.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Supplementary material

10858_2011_9590_MOESM1_ESM.pdf (2.2 MB)
Supporting Text containing a detailed description of the quaternion-based criteria; Fig. S1 showing the crystal structures of the PKA/L1 and PKA/L2 complexes; Fig. S2 describing schematically the INPHARMA methodology; Fig. S3 representing the PKA structures along the MD trajectory; Fig. S4 containing the same information as Fig. 2 but for docking with the program GLIDE; Fig. S5 showing the accuracy of the docking results; Table S1 containing the experimental INPHARMA data. (PDF, 2204 kB)

References

  1. Bartoschek S, Klabunde T, Defossa E, Dietrich V, Stengelin S, Griesinger C, Carlomagno T, Focken I, Wendt KU (2010) Drug design for G-protein-coupled receptors by a ligand-based NMR method. Angew Chem Int Ed 49(8):1426–1429. doi:10.1002/anie.200905102
  2. Berman HM, Westbrook JD, Gabanyi MJ, Tao W, Shah R, Kouranov A, Schwede T, Arnold K, Kiefer F, Bordoli L, Kopp J, Podvinec M, Adams PD, Carter LG, Minor W, Nair R, La Baer J (2009) The protein structure initiative structural genomics knowledgebase. Nucleic Acids Res 37:D365–D368
  3. Carlomagno T, Felli I, Czech M, Fischer R, Sprinzl M, Griesinger C (1999) Transferred cross-correlated relaxation: application to the determination of sugar pucker in an aminoacylated tRNA-mimetic weakly bound to EF-Tu. J Am Chem Soc 121:1945–1948
  4. Carlomagno T, Sanchez V, Blommers M, Griesinger C (2003) Derivation of dihedral angles from CH–CH dipolar–dipolar cross-correlated relaxation rates: a C–C torsion involving a quaternary carbon atom in epothilone A bound to tubulin. Angew Chem Int Ed Engl 42:2515–2517
  5. Davis IW, Baker D (2009) ROSETTALIGAND docking with full ligand and receptor flexibility. J Mol Biol 385(2):381–392. doi:10.1016/j.jmb.2008.11.010
  6. Davis IW, Raha K, Head MS, Baker D (2009) Blind docking of pharmaceutically relevant compounds using RosettaLigand. Protein Sci 18:1999–2002
  7. Jain AN (2003) Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine. J Med Chem 46:499–511
  8. Katritch V, Rueda M, Lam PC, Yeager M, Abagyan R (2010) GPCR 3D homology models for ligand screening: lessons learned from blind predictions of adenosine A2a receptor complex. Proteins 78:197–211
  9. Korb O, Stützle T, Exner TE (2009) Empirical scoring functions for advanced protein-ligand docking with PLANTS. J Chem Inf Model 49:84–96
  10. Kortagere S, Ekins S (2010) Troubleshooting computational methods in drug discovery. J Pharmacol Toxicol Methods 61:67–75
  11. MacKerell AD Jr, Bashford D, Bellott M, Dunbrack RL Jr, Evanseck J, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE III, Roux B, Schlenkrich M, Smith J, Stote R, Straub J, Watanabe M, Wiorkiewicz-Kuczera J, Yin D, Karplus M (1998) All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102:3586–3616
  12. Mayer M, Meyer B (2001) Group epitope mapping by saturation transfer difference NMR to identify segments of a ligand in direct contact with a protein receptor. J Am Chem Soc 123:6108–6117
  13. Ni F, Scheraga HA (1994) Use of the transferred nuclear Overhauser effect to determine the conformations of ligands bound to proteins. Acc Chem Res 27(9):257–264
  14. Orts J, Tuma J, Reese M, Grimm SK, Monecke P, Bartoschek S, Schiffer A, Wendt KU, Griesinger C, Carlomagno T (2008) Crystallography-independent determination of ligand binding modes. Angew Chem Int Ed Engl 47:7736–7740
  15. Orts J, Griesinger C, Carlomagno T (2009) The INPHARMA technique for pharmacophore mapping: a theoretical guide to the method. J Magn Reson 200(1):64–73
  16. Pellecchia M, Sem D, Wüthrich K (2002) NMR in drug discovery. Nat Rev Drug Discov 1:211–219
  17. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kalé L, Schulten K (2005) Scalable molecular dynamics with NAMD. J Comput Chem 26:1781–1802
  18. Pochapsky S, Pochapsky T (2001) Nuclear magnetic resonance as a tool in drug discovery, metabolism and disposition. Curr Top Med Chem 1:427–441
  19. Rai B, Tawa G, Katz A, Humblet C (2010) Modeling G protein-coupled receptors for structure-based drug discovery using low-frequency normal modes for refinement of homology models: application to H3 antagonists. Proteins 78:457–473
  20. Rees DC, Congreve M, Murray CW, Carr R (2004) Fragment-based lead discovery. Nat Rev Drug Discov 3(8):660–672
  21. Reese M, Sanchez-Pedregal VM, Kubicek K, Meiler J, Blommers MJJ, Griesinger C, Carlomagno T (2007) Structural basis of the activity of the microtubule-stabilizing agent epothilone A studied by NMR spectroscopy in solution. Angew Chem Int Ed Engl 46(11):1864–1868
  22. Sanchez-Pedregal VM, Reese M, Meiler J, Blommers MJ, Griesinger C, Carlomagno T (2005) The INPHARMA method: protein-mediated interligand NOEs for pharmacophore mapping. Angew Chem Int Ed Engl 44(27):4172–4175
  23. Schroedinger LLC (2003) The Glide 2.5 calculations used FirstDiscovery, version 2.5021. New York
  24. The UniProt Consortium (2008) The universal protein resource (UniProt). Nucleic Acids Res 36:D190–D195
  25. Van Dongen M, Weigelt J, Uppenberg J, Schultz J, Wikström M (2002) Structure-based screening and design in drug discovery. Drug Discov Today 7:471–478
  26. Wyss D, McCoy M, Senior M (2002) NMR-based approaches for lead discovery. Curr Opin Drug Discov Dev 5:630–647

Copyright information

© The Author(s) 2011

Authors and Affiliations

  • Julien Orts (1)
  • Stefan Bartoschek (2)
  • Christian Griesinger (3)
  • Peter Monecke (4)
  • Teresa Carlomagno (1)

  1. EMBL, Structure and Computational Biology Unit, Heidelberg, Germany
  2. Sanofi-Aventis Deutschland GmbH, R&D LGCR/Parallel Synthesis & Natural Products, Industriepark Hoechst, Frankfurt am Main, Germany
  3. Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
  4. Sanofi-Aventis Deutschland GmbH, R&D LGCR/Structure, Design & Informatics, Industriepark Hoechst, Frankfurt am Main, Germany
