Pharmacophore-Based Virtual Screening

  • Dragos Horvath
Part of the Methods in Molecular Biology book series (MIMB, volume 672)


This chapter is a review of the most recent developments in the field of pharmacophore modeling, covering both methodology and application. Pharmacophore-based virtual screening is nowadays a mature technology, well accepted in the medicinal chemistry laboratory. Nevertheless, like any empirical approach, it has specific limitations, and efforts to improve the methodology are still ongoing. Fundamentally, the core idea of “stripping” functional groups of their actual chemical nature in order to classify them into very few pharmacophore types, according to their dominant physico-chemical features, is both the main advantage and the main drawback of pharmacophore modeling. The advantage is simplicity – the complex nature of noncovalent ligand binding interactions is rendered intuitive and comprehensible to the human mind. Although computers are much better suited for comparisons of pharmacophore patterns, a chemist’s intuition is primarily scaffold-oriented. The drawback is that these underlying simplifications render pharmacophore modeling unable to provide perfect predictions of ligand binding propensities, even if all its remaining technical problems were solved. Each step in pharmacophore modeling and exploitation has specific drawbacks: from insufficient or inaccurate conformational sampling, to ambiguities in pharmacophore typing (mainly due to uncertainty regarding the tautomeric/protonation status of compounds), to computer time limitations in complex molecular overlay calculations, and to the choice of inappropriate anchoring points in active sites when ligand cocrystal structures are not available. Yet, imperfections notwithstanding, the approach is accurate enough to be practically useful and is actually the most used virtual screening technique in medicinal chemistry – notably for “scaffold hopping” approaches, allowing the discovery of new chemical classes that carry a desired biological activity.

Key words

Pharmacophores Ligand-based design Structure-based design Molecular overlay Machine learning Virtual screening Conformational sampling 

1 Introduction

Pharmacophores are defined [1] as “the ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target structure and to trigger (or to block) its biological response.” They represent a conceptual model aimed at describing structure-binding affinity relationships by means of a simple set of approximate rules of thumb. Chemistry cannot be, in practice, reduced to physics. Systems of very few atoms and, more importantly, very few degrees of freedom can be studied by means of rigorous quantum calculations, but, eventually, their results have to be “translated” back into chemical language – electrophilic/nucleophilic attacks, steric hindrance, etc. The human mind is notoriously unable to deal with particle wave functions.

Pharmacophores are, in a broad sense, the mental models and paradigms that form the basis of noncovalent chemistry. Once the three-dimensional nature of molecules and the stereochemistry rules determining their preferred conformations were understood, ligand binding to macromolecules was explained by the (oversimplified) key-and-lock paradigm [2] of shape complementarity. The nature of the noncovalent binding “forces” – electrostatic, hydrogen bonding, and dispersive contributions, including solvation/hydrophobic effects [3] – is however prohibitively complex. The reason for quoting the word “forces” above is that these are not fundamental forces (of which four are thought to govern our Universe: the strong nuclear, the weak (boson-mediated), the electromagnetic, and the gravitational). Steric “repulsion forces” are just a useful mental model to rationalize and “wrap up” the behavior of the electron clouds in interacting atom spheres. These are anything but “spheres” – yet, we tend (do we have any other choice?) to think about them as such, and therefore we need to introduce a van der Waals corrective term, an extra hypothesis that is not required if the “sphere” model is dropped in favor of high-level ab initio calculations. Much of modern chemistry happens at this “atom sphere” level of approximation, so pharmacophore modeling is certain to occupy a privileged position in the hearts and minds of medicinal chemists. The principle of functional group complementarity (cations interact favorably with anions, donors with acceptors, and hydrophobes among themselves) is an essential paradigm in modern medicinal chemistry.

Unsurprisingly, a query for the term “pharmacophore” in the Web of Knowledge [4] database returns several hundred citations per year. Simulation-based affinity predictions – flexible docking [5] or free energy perturbation simulations [6] – are typically too time-consuming to be of large-scale practical use (even though they, too, are based on severe approximations of the physical reality, using empirical force field [7, 8] energy calculations).

As in force field calculations, the first step in pharmacophore modeling is atom typing – classification of the atoms, in terms of their nature and chemical environment, into predefined categories associated with a specific interaction behavior. Force field typing is merely a much finer classification scheme, leading to force field “types” associated with specific parameters describing the expected intensities of interaction. Pharmacophore typing typically does not go beyond a gross physico-chemical classification into “hydrophobes” (with or without the aromatic rings, which may be classified separately), “polar positives” (hydrogen bond donors and cations), and “polar negatives” (acceptors and anions). Unlike force field typing, pharmacophore typing allows chemically different atoms to be assigned to the same class (any heteroatom possessing a lone pair may in principle qualify as a hydrogen bond acceptor). Also, pharmacophore models do not provide any explicit characterization of the strength of interactions between features.
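The gross classification described above can be illustrated with a minimal sketch. It assumes a hypothetical, hand-written rule table keyed by functional-group labels; the labels, the `RULES` table, and the `type_groups` helper are illustrative only and are not taken from any of the cited software packages, which derive such groups from the molecular graph.

```python
# Toy rule-based pharmacophore typing.
# ASSUMPTION: group labels and the rule table are hypothetical examples.
RULES = {
    "carboxylate": {"anion", "acceptor"},
    "ammonium": {"cation", "donor"},
    "hydroxyl": {"donor", "acceptor"},
    "ketone_O": {"acceptor"},
    "pyridine_N": {"acceptor"},
    "t_butyl": {"hydrophobe"},
    "phenyl": {"aromatic", "hydrophobe"},
}


def type_groups(groups, merge_aromatic=False):
    """Map functional-group labels to sets of pharmacophore flags.

    If merge_aromatic is True, the separate 'aromatic' flag is dropped,
    so aromatic rings are treated as ordinary hydrophobes.
    """
    typed = []
    for group in groups:
        flags = set(RULES.get(group, set()))  # unknown groups get no flag
        if merge_aromatic:
            flags.discard("aromatic")
        typed.append((group, flags))
    return typed


ligand = ["phenyl", "ketone_O", "pyridine_N", "carboxylate"]
for group, flags in type_groups(ligand):
    print(group, sorted(flags))
```

Note that the chemically different ketone oxygen and pyridine nitrogen end up in the same "acceptor" class, and that nothing in the output says how strong either interaction is expected to be.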

Next, a critical step in pharmacophore modeling is the conformational sampling of (a) known ligands (and known non-binders, essential negative examples in the machine learning process), to be used for pharmacophore extraction in so-called ligand-based approaches, and (b) all the candidate compounds from the electronic database that need to be matched against the pharmacophore hypothesis during the virtual screening. Conformational sampling is an extremely complex, multimodal optimization problem [9] that may require computer-intensive, massively parallel approaches for compounds exceeding a certain flexibility threshold (typically, several tens of rotatable bonds). Fortunately, drug-like compounds are less complex. Unfortunately, they are numerous. Reducing the conformational sampling time to a few seconds or, at most, minutes for each compound, using a simplified molecular force field for conformational strain energy estimations, does not guarantee the sampling of biologically relevant conformers. The sampling problem is far from being solved, as will be seen in §2.3.

The key step in pharmacophore modeling is obviously the construction of the pharmacophore hypothesis. A typical pharmacophore hypothesis (see Fig. 1) delimits a set of space regions (typically spheres) supposed to harbor functional groups of specified type when some low-energy conformer of the compound is optimally aligned with respect to it. The “optimal alignment” is the one allowing a maximum of spheres to be populated by corresponding groups – the relative importance of having each of them populated is an intrinsic estimate of the expected strength of the interaction it stands for.
Fig. 1. Typical ligand-based pharmacophore model (extracted [31] by Catalyst [109]/HipHop on the basis of active Cox-2 inhibitors). Spheres delimit space zones supposed to harbor functional groups of the indicated pharmacophore types, as is the case for the pictured overlaid Cox-2 ligand. Note that hydrogen bonding interactions (and sometimes aromatic stacking too) are, in most commercial software packages, rendered directionally – they specify both a position for the ligand heavy atom and its polar hydrogen and/or lone pairs, or, respectively, a sphere for the expected protein partner atom.

These space regions are supposed to represent a map of interaction “hot spots” where favorable contacts to the protein site take place – which is certainly the case in structure-based approaches where these hot spots are picked from experimentally determined ligand–site cocrystal structure geometries. Structure-based ligand-free hypotheses (potential interaction points obtained by mapping of the empty active site) or ligand-based hypotheses (regrouping some consensus motif seen in active ligands, and therefore thought to be important for activity) are no longer sure to include the actual binding hot spots. Note that ligand-based hypotheses may be built either from an overlay model of (what are thought to be) calculated “bioactive” conformers of active compounds, or from machine-learning driven extraction of common patterns seen in the spatial distribution of pharmacophore groups in active ligands. In overlay-free models, common pattern extraction may as well (or perhaps better, when drastic geometry sampling artifacts hamper 3D modeling) be performed with topological distance values measuring separation between pharmacophore features. Methodologies exploiting such “topological pharmacophores” have been discussed elsewhere [10].
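To make the notion of topological distances between pharmacophore features concrete, the following sketch counts feature pairs together with the number of bonds separating them. It is an illustration under stated assumptions, not one of the methods of [10]: the molecule is given as a hypothetical adjacency dictionary, assumed to be connected, and `bond_distances` and `topological_pairs` are names introduced here.

```python
from collections import Counter, deque
from itertools import combinations


def bond_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def topological_pairs(adjacency, features):
    """Count (type_a, type_b, bond_distance) pairs over all typed atoms.

    `features` maps atom index -> pharmacophore type; the two types are
    sorted so that the pair does not depend on atom order.
    """
    counts = Counter()
    for a, b in combinations(sorted(features), 2):
        d = bond_distances(adjacency, a)[b]  # assumes a connected graph
        type_a, type_b = sorted((features[a], features[b]))
        counts[(type_a, type_b, d)] += 1
    return counts


# Hypothetical 5-atom chain 0-1-2-3-4 with three typed atoms
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
flags = {0: "donor", 2: "hydrophobe", 4: "acceptor"}
print(topological_pairs(chain, flags))
```

Because only bond counts are used, the resulting pattern is independent of any conformer, which is what makes it robust to geometry sampling artifacts.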

Finally, the actual virtual screening is performed by matching candidate ligands against the pharmacophore hypothesis. Some quantitative measure of match between a candidate conformer and the hypothesis needs to be defined beforehand (in the case of ligand-based methods, this may – but need not – be the objective function used to overlay the ligands prior to hypothesis generation). In overlay-free approaches, scoring is provided by summing up the weighted contributions of key pharmacophore elements, as in a QSAR model. Ligands having at least one well-scoring conformation are then considered as the “virtual hits” of the approach and should be subjected to synthesis and testing.
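The screening step can be sketched as follows, under strong simplifying assumptions: each conformer is taken to be already aligned to the hypothesis (the overlay optimization itself is omitted), features are points, and the match measure is simply the sum of the weights of the populated spheres. The function names, the coordinates, and the weights are hypothetical.

```python
import math


def match_score(hypothesis, conformer):
    """Sum the weights of the hypothesis spheres populated by the conformer.

    hypothesis: list of (center_xyz, radius, feature_type, weight)
    conformer:  list of (feature_type, xyz), ASSUMED already aligned.
    """
    score = 0.0
    for center, radius, ftype, weight in hypothesis:
        populated = any(
            t == ftype and math.dist(center, xyz) <= radius
            for t, xyz in conformer
        )
        if populated:
            score += weight
    return score


def is_virtual_hit(hypothesis, conformers, threshold):
    """A ligand is a hit if at least one conformer reaches the threshold."""
    return any(match_score(hypothesis, c) >= threshold for c in conformers)


# Hypothetical three-sphere hypothesis and two conformers of one ligand
hypothesis = [
    ((0.0, 0.0, 0.0), 1.5, "acceptor", 1.0),
    ((5.0, 0.0, 0.0), 1.5, "hydrophobe", 0.5),
    ((0.0, 4.0, 0.0), 1.0, "donor", 1.0),
]
conf_a = [
    ("acceptor", (0.5, 0.0, 0.0)),
    ("hydrophobe", (5.5, 0.5, 0.0)),
    ("donor", (3.0, 3.0, 0.0)),
]
conf_b = [("acceptor", (4.0, 4.0, 4.0))]

print(match_score(hypothesis, conf_a))
print(is_virtual_hit(hypothesis, [conf_b, conf_a], threshold=1.5))
```

The "at least one conformer" rule in `is_virtual_hit` is what makes the outcome so sensitive to the quality of conformational sampling: a single fortuitous geometry is enough to promote a compound to a virtual hit.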

In terms of medicinal chemistry applications, the pharmacophore is often viewed as being complementary to the molecular scaffold. Scaffold hopping [11] became a central paradigm in drug design (see Fig. 2) – the quest for bioisosteric, topologically different structures, which nevertheless orient their interacting groups in space in a similar way to the starting compound and therefore display similar interactions with biological targets. Its importance stems from its ability to open new synthetic routes once all the analogs around a given scaffold have been explored, to escape chemical space covered by scaffold-based patent applications, or to discover molecules with different pharmacokinetic properties but similar binding affinities with respect to the intended target. Lead optimization is therefore alternatively oriented along two conceptually orthogonal research directions [12]: the sampling of various scaffolds compatible with a given pharmacophore pattern, and the sampling of various pharmacophore patterns that can be supported by a given scaffold.
Fig. 2. A typical ligand-based scaffold-hopping scenario: both (a) and (b) are Farnesyl Transferase inhibitors although they are chemically very different compounds, not based on a common scaffold. The overlay model, however, reveals the pharmacophoric equivalence of certain functional groups. Besides the extensive overlap of hydrophobic/aromatic moieties, the hydrogen bond acceptors mapped by the arrows are equivalently distributed in space although they actually involve different heteroatoms (oxygens vs. pyridine/imidazole nitrogen atoms).

Beyond scaffold hopping as defined above, new active molecules that feature both a novel skeleton and a novel binding pattern to the target, hence corresponding to a completely original binding “paradigm,” typically represent a major breakthrough in drug discovery. In addition to the above-noted advantages, these might, unlike the classical inhibitors matching the well-established pharmacophore, bind to different site pockets and therefore present different specificity profiles within the family of closely related targets. Obviously, simultaneous scaffold and pharmacophore hopping cannot be achieved with methods that learn a pharmacophore from a series of known binders, which do not provide any information on alternative binding modes. Therefore, this is not, strictly speaking, scaffold “hopping,” for there is no reference scaffold to “hop” away from. The required information may only come from the protein structure, by designing novel putative binding pharmacophores, involving new anchoring points not exploited by known ligands. It is a potentially high-benefit, but certainly high-risk approach, for not all the solvent-accessible hydrogen bond donors and acceptors in the active site are valid anchoring points (see discussion in Subsection 2.4.1).

Nowadays, pharmacophore-based virtual screening and modeling has reached maturity and has been extensively reviewed in past literature [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]. The goal of the present review is not to retrace the historical development of the domain, nor to provide any comprehensive list of commercial or academic software (programs are cited as mentioned in the reviewed articles). This contribution focuses on the last few years and the most interesting methodological developments and applications they brought. In the first part (Section 2), recently published technical developments will be presented, grouped according to the strategies they address (structure-based vs. ligand-based, and, within the latter, overlay-based vs. overlay-independent). Key issues common to all these approaches – such as conformational sampling – are discussed first.

Last but not least (Section 3), an overview of the latest pharmacophore virtual screening-based applications is given, focusing only on articles presenting experimental validation of the virtual hits found.

2 An Overview of Latest Pharmacophore-Based Methodological Progress in Chemoinformatics

This section briefly reviews the latest methodological advances in pharmacophore modeling, covering first the general aspects (pharmacophore typing, conformational sampling), then structure-based and finally ligand-based pharmacophore elucidation techniques.

2.1 Experimental Pharmacophore Detection

Although this is beyond the actual scope of this chemoinformatics-centric review, it is important to note that X-ray crystallography is not the only technique able to elucidate binding pharmacophores. Nuclear Magnetic Resonance, monitoring either chemical shift changes in the binding site, or relying on site/ligand Nuclear Overhauser Effects (NOE), may be used to extract the relative or even the absolute binding modes of ligands. In a recent contribution [26], it is shown that transfer-NOEs – transferring magnetization between the protons of two ligands competing for (weak reversible) binding to a common protein site, by means of the contact protons of the site – can be enhanced by reducing intraprotein spin diffusion (ideally, by deuterating the bulk protein while keeping the protons on the specific contact amino acids of the active site). A successful transfer of magnetization from the proton of a “reference” ligand of known binding mode to a specific proton of another compound implies that the latter will occupy the same binding pocket. However, such methods are far from being routinely used in drug design, given their high degree of technological sophistication.

2.2 Pharmacophore Feature Detection (Typing): Does More Chemical Sense Make Better Pharmacophore Models?

Classically, pharmacophore typing is based on pragmatic rules: alkyl chains and halide groups are labeled as hydrophobes; aromatic rings may or may not be assigned to a specific “aromatic” category rather than joined to “ordinary” hydrophobes, whereas polar groups are typed according to the number of attached polar hydrogens and/or lone pairs at the expected protonation status. Also, the degree of resolution – flagging of individual atoms rather than considering “united” functional groups as pharmacophore feature carriers – is yet another empirical choice with respect to which various approaches largely differ. Some software tools allow the user to configure the pharmacophore typing schemes, others do not. To our knowledge, no two authors independently envisaging a pharmacophore typing scheme ever came up, by pure chance, with the same set of flagging rules. Some authors allow atoms to carry more than one pharmacophore flag (a carboxylate is both an “anion” and a “hydrogen bond acceptor”) while others do not (in this case, in order to comply with the “one group – one type” policy, some amphiphilic donor-acceptor type needs to be introduced for, say, alcohol –OH groups).

However, different rule sets may nevertheless lead to the same pharmacophore typing result in straightforward situations in which there is, chemically speaking, little room to dispute that a carboxylate is an anion, the t-butyl group a hydrophobe, and the ketone carbonyl an acceptor. The more complex the molecule, the likelier it is that any given rule-based flagging approach will fail for some functional groups. In particular, the uncertainty of the actual bioactive tautomeric form and the presence of multiple ionizable groups influencing each other’s pKa values – such as the two aliphatic amine groups in piperazine rings, which are not simultaneously protonated at pH = 7, as rule-based flagging would suggest – are often sources of pharmacophore mistyping.
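The piperazine case can be made quantitative with the Henderson–Hasselbalch relation, which gives the protonated fraction of a basic site as 1/(1 + 10^(pH − pKa)). The sketch below is an illustration only: the two pKa values are approximate, assumed numbers for the first and second protonation of piperazine, and treating the second protonation as an independent site is itself a simplification.

```python
def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch: fraction of a basic site present as BH+."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# ASSUMPTION: approximate, illustrative pKa values for piperazine;
# they are not taken from the cited work.
piperazine_pkas = {"first protonation": 9.7, "second protonation": 5.3}

populations = {
    step: protonated_fraction(pka, ph=7.0)
    for step, pka in piperazine_pkas.items()
}
for step, fraction in populations.items():
    print(f"{step}: {fraction:.3f} protonated at pH 7")
```

Under these assumed values, the first nitrogen is almost fully protonated at pH 7 while the doubly protonated species is populated at only a few percent, so a rule that flags every aliphatic amine as a cation would place two cations where essentially one is present.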

Recently, the introduction of a pharmacophore flagging scheme based on calculated estimations of pKa values of ionizable groups [27] revealed that some of the observed activity cliffs [28, 29] – structurally very similar compound pairs with diverging activities – may be explained by subtle changes in pKa values of ionizable groups, themselves translating into significant changes of relative population levels of the conjugated acidic/basic species. However, the application of this same strategy to generate fuzzy pharmacophore triplets as molecular descriptors used in QSAR studies [30] did not reveal any strategic advantage over classical rule-based flagging. QSAR build-up, in that context, can be assimilated to a pharmacophore elucidation approach, as it selects individual descriptors – pharmacophore triangles – seen to best correlate with observed activity. Now, pharmacophore learning – be it by QSAR-driven descriptor selection, inductive learning, or overlay-based (see later sections) – works best when the training set actives display as different a pharmacophore pattern as possible with respect to inactives. Therefore, the “best” pharmacophore flagging scheme, in this sense, is the one maximizing active versus inactive overall pattern dissimilarity, and not necessarily the physico-chemically most relevant one. The cited paper discusses one example in which the chemically wrong rule-based flagging scheme considered the benzodiazepine =N– in the 7-membered ring as protonated. Yet, this wrong flag served as a primary marker of the – mainly active – set of compounds based on the benzodiazepine scaffold, by contrast to the mostly inactive ones, based on alternative scaffolds without the fake cation.

The main problem [31] with pharmacophore typing schemes is, however, that the complex site–ligand interaction mechanisms cannot be rigorously understood in terms of some six or so functional group types. All flagging schemes – pKa-based or not – agree that a carboxylate group (acceptor, anion) is pharmacophorically different from, and thus not interchangeable with, the hydrophobe –CF3. Yet, it is also well known that the Cyclooxygenase II binding site easily accommodates –CF3 groups in a carboxylate binding pocket. This is an example of the fundamental limitation of the pharmacophore concept, which, all its successes notwithstanding, represents an extremely sketchy and poor model of binding interactions. Also, the protonation state of a bound ligand may be, due to the influence of the binding pocket, different from the most populated state in solution. This effect cannot be taken into account by pure ligand-based approaches and is extremely difficult to model even if the structure of the binding pocket is known.

A recent [32] and potentially significant advance, in this respect, relies on the idea of completely abandoning pharmacophore typing according to a very restricted number of possible features, and instead representing atoms by the more fine-grained, context-sensitive force field types [7] used in molecular mechanics parameterization of intramolecular interaction energies. Such typing would be too fine-grained if the “pharmacophore” hypothesis were to be formulated in terms of requiring atoms of strictly specified types at key positions of a ligand. However, the authors allow for fuzzy type matching – the substitution of a key atom of force field type t in an active by a different atom of type t′ in a candidate molecule is hypothesized not to cause the loss of initial activity, insofar as types t and t′ are related. The idea of allowing partial “cross-matching” of related types is not new (for example, in [27] aromatics and hydrophobes were assigned different flags but considered to be partially replaceable). The key element of originality here is the idea of objectively measuring the degree of equivalence of t and t′ based on how often they were actually seen to replace each other as anchoring points to the same “hot spot” of a protein site. Building a 3D “Flexophore” descriptor based on this flagging scheme, the authors find that “Flexophore descriptor detects active molecules despite chemical dissimilarity [from scaffold-hop benchmarking sets featuring no close analogs according to “classical” similarity scoring – author's note] whereas the results for the screening of the complete data sets show enrichments comparable to chemical fingerprint descriptors.” A further assessment [33] of Flexophore technology with respect to DUD molecules confirmed this scaffold-hopping ability.
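One way to picture such an empirically derived type equivalence is sketched below. This is not the scoring used in [32]; it is a stand-in that assumes a Jaccard-style ratio (hot spots where both types were seen, divided by hot spots where either was seen). The atom type labels and the observations are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def type_similarity(hot_spots):
    """Jaccard-style similarity between atom types.

    hot_spots maps a hot-spot identifier to the set of atom types seen
    anchoring ligands there. ASSUMPTION: the Jaccard ratio is used here
    as a simple stand-in for an observed replacement frequency.
    """
    seen_at = defaultdict(set)
    for spot_id, types in hot_spots.items():
        for atom_type in types:
            seen_at[atom_type].add(spot_id)

    sim = {}
    for a, b in combinations(sorted(seen_at), 2):
        shared = len(seen_at[a] & seen_at[b])
        either = len(seen_at[a] | seen_at[b])
        sim[(a, b)] = sim[(b, a)] = shared / either
    for atom_type in seen_at:
        sim[(atom_type, atom_type)] = 1.0
    return sim


# Hypothetical observations: hot spot -> fine-grained atom types seen there
observed = {
    "spot1": {"O.co2", "O.3"},
    "spot2": {"O.co2", "N.ar"},
    "spot3": {"C.3"},
    "spot4": {"O.co2", "O.3"},
}
sim = type_similarity(observed)
print(round(sim[("O.3", "O.co2")], 3), sim[("C.3", "O.co2")])
```

A fuzzy match between a query atom of type t and a candidate atom of type t′ could then be weighted by `sim[(t, t′)]` instead of requiring strict type identity.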

The “ultimate” pharmacophore typing scheme is the one (apparently or effectively) abolishing pharmacophore types altogether and describing the chemical environment of functional groups by molecular field intensities, as in the now classical CoMFA [34] methodology. CoMFA and a plethora of related techniques do however continue to rely on force field typing schemes for partial charge and hydrophobe property assignments. Electrostatic fields may, however, be derived from quantum-mechanical calculations, and some authors recently [35] argued that, given the steady increase of available computer power, this may no longer represent a computational bottleneck. This grants an effective independence from any empirical atom typing scheme and generates valid ligand overlays (see §2.6.1), but there are no compelling studies showing that such techniques are systematically superior to atom typing-based approaches. Indeed, a hyper-accurate, quantum-level description of a single conformer or a few conformers meant to serve as input for empirical overlay calculations does not make much physico-chemical sense: binding free energy is related to the Boltzmann ensemble properties of free and bound states, not to some empirical overlay score.

2.3 Progress in Conformational Sampling Techniques

Conformational sampling techniques are thus a core component of pharmacophore-based screening, and still a hot research topic although a plethora of commercial and/or free software dealing with the problem is already available. Recent contributions to the topic [36, 37, 38] mainly deal with technical issues – intelligent conformer space coverage strategies accounting for fragment symmetry, multi-objective evolutionary strategies. Although faster and more robust, the latest published approaches are not fundamentally new and fail to address the critical point of the accuracy of calculated strain energies, which is needed to enable specific selection of relevant geometries and hence minimize the odds of fortuitous pharmacophore matching. Certainly, the use of a state-of-the-art force field [36] instead of the simple Catalyst energy function [38] may represent a good strategy. There are no definite rules concerning the maximal strain energy still tolerable in a bioactive conformer (and there never will be, for the physics of the ligand binding process is controlled by free energy, a costly parameter that cannot be routinely calculated as part of pharmacophore virtual screening). Some studies [39, 40] suggest something like ~1 kcal/mol of tolerable strain for each rotatable bond – far less than the typical 20–50 kcal/mol cutoffs used with commercial software, independently of ligand flexibility.
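The practical difference between a flexibility-scaled tolerance and a flat cutoff can be seen in a small sketch. It assumes that strain is measured relative to the lowest-energy conformer found (itself only an estimate of the global minimum) and uses hypothetical energies; `strain_limit` and `filter_by_strain` are illustrative names.

```python
def strain_limit(n_rotatable_bonds, per_bond=1.0):
    """Tolerated strain (kcal/mol) scaled with ligand flexibility."""
    return per_bond * n_rotatable_bonds


def filter_by_strain(energies, limit):
    """Indices of conformers whose strain does not exceed `limit`.

    ASSUMPTION: strain is the energy above the lowest-energy conformer
    in the sampled set.
    """
    e_min = min(energies)
    return [i for i, e in enumerate(energies) if e - e_min <= limit]


# Hypothetical conformer energies (kcal/mol) for a ligand with 4 rotatable bonds
energies = [12.0, 13.5, 17.0, 40.0]

scaled = filter_by_strain(energies, strain_limit(4))  # ~1 kcal/mol per bond
flat = filter_by_strain(energies, 20.0)               # flat 20 kcal/mol cutoff
print(scaled, flat)
```

With these numbers the scaled limit retains two conformers and the flat cutoff three; every extra conformer kept is an additional opportunity for fortuitous pharmacophore matching.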

2.4 Structure-Based Pharmacophore Modeling

The terminology “structure-based” has been coined to refer to models generated from known ligand–site binding modes, or from empty protein binding sites featuring potential anchoring points, by contrast to “ligand-based” approaches where pharmacophore inference relies on information contained in the structures of known ligands (and, optionally, non-binders). Knowledge of the target macromolecule structure – and, potentially, of the key anchoring points it uses to bind known ligands – is the main advantage of structure-based approaches. Obviously, their use is limited to the cases where such information is available.

Extraction of Binding Pharmacophores from Empty Active Sites

This is the most difficult and risky structure-based design scenario since it relies on a relatively limited amount of experimental information: the plain protein structure (sometimes merely a homology model), some working hypothesis (or, at best, mutagenesis-based information) for localizing the active site and its key residues, and molecular simulation-generated information on the potentially flexible active site regions. Yet, this is also potentially the most interesting application, applicable to orphan targets; it was one of the first to be addressed by developers [41, 42, 43], and is nowadays supported by most of the pharmacophore-building software suites. The active site of the protein is first probed by – typically – some “dry” hydrophobic and some charged probe spheres, to generate a map of “hot spots” where these probes experience the energetically most favorable interactions. The “hot spots” are then clustered together and condensed into a most relevant set of pharmacophore feature spheres. Since there are countless possibilities to parameterize the monitored site energy maps and to condense the hot spots into a pharmacophore query, development is still ongoing in this field – the latest reported procedure [44] is based on the GRID [45] approach for hot spot mapping.
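The condensation of probe hot spots into feature spheres can be sketched with a simple greedy scheme. This is only one of the "countless possibilities" mentioned above and is not the procedure of [44]: the best remaining probe point seeds a sphere and absorbs all same-type points within a chosen radius. Probe coordinates, energies, and the radius are hypothetical.

```python
import math


def condense_hot_spots(points, radius=1.5, max_features=5):
    """Greedy condensation of probe hot spots into feature spheres.

    points: list of (xyz, energy, probe_type); lower energy = more favorable.
    Returns a list of (center_xyz, probe_type, best_energy).
    """
    remaining = sorted(points, key=lambda p: p[1])  # most favorable first
    features = []
    while remaining and len(features) < max_features:
        seed_xyz, seed_energy, seed_type = remaining[0]
        members = [
            p for p in remaining
            if p[2] == seed_type and math.dist(p[0], seed_xyz) <= radius
        ]
        # Sphere center = centroid of the absorbed probe positions
        center = tuple(
            sum(coords) / len(members)
            for coords in zip(*(m[0] for m in members))
        )
        features.append((center, seed_type, seed_energy))
        remaining = [p for p in remaining if p not in members]
    return features


# Hypothetical probe map of an empty site
probe_map = [
    ((0.0, 0.0, 0.0), -3.0, "hydrophobe"),
    ((1.0, 0.0, 0.0), -2.0, "hydrophobe"),
    ((6.0, 0.0, 0.0), -4.0, "cation"),
    ((6.0, 1.0, 0.0), -1.0, "cation"),
    ((0.5, 0.0, 0.0), -2.5, "cation"),
]
for feature in condense_hot_spots(probe_map):
    print(feature)
```

Changing `radius` or `max_features` changes the resulting query, which illustrates why the parameterization of this step remains an open design choice.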

The Impact of Protein Flexibility in Structure-Based Pharmacophore Modeling

Typically, situations where structure-based design can be taken into consideration are rare and, if the prerequisite information concerning the active site is available at all, it may not go beyond a simple “snapshot” of the active site – empty or binding one ligand. Proteins are, however, flexible and may adapt to the incoming ligand. Therefore, the binding mode of a ligand may not be successfully inferred from the site geometry employed to bind another compound – or, for that matter, from the empty site geometry. A recent, in-depth study [46] of structure-based pharmacophore extraction accounting for protein flexibility advocates the use of all the available active site geometries from various cocrystals (rather than using Molecular Dynamics-generated multiple protein geometries) in order to build a map of consensus interactions, present in >50% of the considered site conformations. These do not cover all the important anchoring points but have the merit of being entropically favored (they do not rely on the improbable event of having the active site adopt a very narrowly defined set of geometries). Molecules matching these key points have a good chance of being active, for they account for the “must-have” interactions with the rigid part of the active site. The remaining, flexible part of the active site is, by definition, able to “adapt” to the incoming ligands – generate new, unexpected favorable contacts, or, on the contrary, move away from the ligand and avoid bad contacts.
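The ">50% of site conformations" criterion amounts to a simple frequency filter, sketched below. The interaction labels are hypothetical placeholders (they are not DHFR residues), and the representation of each site conformation as a set of interaction labels is an assumption made for illustration.

```python
from collections import Counter


def consensus_interactions(site_conformations, min_fraction=0.5):
    """Interactions present in MORE than `min_fraction` of the conformations.

    site_conformations: list of collections of interaction labels, one per
    site geometry (for example, one per cocrystal structure).
    """
    counts = Counter()
    for interactions in site_conformations:
        counts.update(set(interactions))  # count each label once per geometry
    n = len(site_conformations)
    return {label for label, c in counts.items() if c / n > min_fraction}


# Hypothetical interaction maps from four site geometries
site_maps = [
    {"resA:donor", "resB:hydrophobe", "resC:acceptor"},
    {"resA:donor", "resB:hydrophobe"},
    {"resA:donor", "resC:acceptor"},
    {"resA:donor", "resD:cation"},
]
print(consensus_interactions(site_maps))
```

In this toy case only the interaction seen in all four geometries survives; those seen in exactly half are rejected by the strict inequality, which shows how sensitive the consensus map is to the number of available structures.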

There are few macromolecular systems (such as Dihydrofolate Reductase, used here) that boast the wealth of available ligand cocrystal structures needed for this ambitious study, which is likely to limit the interest of the pharmaceutical industry in the methodology (a target known in such detail is no longer a “hot” issue for competitive drug development – but Molecular Dynamics-generated flexibility may be an alternative). Yet, results of the reported case study are quite encouraging. Pharmacophore hypotheses generated from multiple binding site conformers of human and Pneumocystis carinii DHFR were able to preferentially retrieve strong and weak DHFR binders over non-binders in virtual screening experiments. Furthermore, and surprisingly, they actually maintained their ability to select species-specific binders over promiscuous binders inhibiting the DHFR of multiple species, although a loss of specificity is typically an expected consequence of flexible site modeling. This conclusion appears to be strengthened by the fact that pharmacophore models derived from Candida albicans DHFR, with a significantly smaller set of different site conformers than used in the previous two cases, do actually lose species specificity in virtual screening.

In a further study [47], it was shown that this procedure appears to be even more successful when based on NMR-derived ensembles of geometries. Models from the NMR ensemble and from a collection of crystal structures were both able to discriminate known HIV-1p inhibitors from decoy molecules and displayed superior performance over models created from single conformations of the protein, but the NMR-based model appeared to be the most general yet accurate representation of the active site. This is in agreement with the observation that there is more structural variation among the 28 structures of an NMR ensemble than among 90 crystal structures bound to a variety of ligands. This work encourages the use of NMR models in structure-based design.

Detecting Meaningful Pharmacophore Anchoring Points in Protein Sites

Above-mentioned (Subsections 2.2 and 2.4) failures and limitations of the pharmacophore-based affinity scoring schemes should be a surprise to nobody. For the matter, docking procedures, which ultimately differ from pharmacophore matching tools only in terms of the hyperfine atom typing scheme in the force field/scoring function, are hardly better off in this respect. All the hypothesized “favorable” ligand–site interactions contribute, to the overall free energy, some small and highly context-sensitive increment, representing a small difference of several conflicting high-energy effects. Hydrogen bonding, for example, implies a favorable electrostatic interaction between a partially positively polarized ligand/site hydrogen atom and some electron lone pair of a partner heteroatom – at the cost of desolvating these partners, which previously formed (equally strong? stronger? weaker?) bridges to water molecules. The herein liberated waters may now connect among themselves in the bulk solvent – a favorable contribution to the ligand binding energy balance, which involves neither site nor ligand. Two H bonds (ligand–water and protein–water) are broken, two are formed (protein–ligand, water–water). Is the energy balance nil? Positive? Negative? It may be either way, depending on very subtle effects, such as the entropic aspects of all these interactions (would the protein–ligand bridge rigidify the implied protein side chain? Is the water molecule interacting with this side chain, in the uncomplexed state, restricted in its ability to form additional H bonds? …). The hypothesis that protein–ligand H bonds are favorable may be statistically valid – they are more often favorable than not but this is of little help when the interaction of a peculiar ligand with a specified site has to be analyzed. 
Knowledge-based potentials are equally biased, for they are learned only from complexes with predominantly favorable contacts – otherwise, these complexes would not have been available for X-ray snapshots.

In-depth molecular simulations [48] aimed at accurately capturing subtle entropic effects are in principle required to understand the actual contributions of individual contexts. Unfortunately, even if feasible, they are anything but routine tools to serve in high throughput drug discovery. No matter how sophisticated a simulation, it is likely that key interactions seen in all the crystal structures are not spontaneously scoring better energy/pharmacophore match increments than alternative, “fake” contacts – valid interactions in theory, but never seen to happen in practice.

Alternatively, machine learning from known protein–ligand complex structures can help to understand which specific hydrogen bonds, salt bridges, and hydrophobic contacts, out of all the possible ones that could be established by solvent-accessible atoms of a protein active site, are the strongest contributors to binding affinity. A recent study [49] showed that the prioritization of cavity atoms that should be targeted for ligand binding can be achieved by training machine learning algorithms with atom-based fingerprints of known ligand-binding pockets. The knowledge of hot spots for ligand binding is here used for focusing structure-based pharmacophore models.

The idea was taken one step further [50], from simple selection of significant interaction patterns to a continuous weighting of the considered interaction patterns (called “shims” in the original work) according to a partial-least-squares analysis of the relationship between observed binding affinities and the population status of each monitored interaction pattern. The “shim” contributions calibrated in this way can be used as a target-specific correction of standard scoring functions of docking poses, or to learn affinity-predicting empirical models for homologous targets (for which a crystal structure is not needed).

As always, machine learning is a powerful tool to highlight known patterns and to discover related ones, by means of limited extrapolation. While learning to distinguish “hot spots” – the protein interaction sites actually used by some known ligands – from “cold” interaction sites (interaction opportunities not exploited by ligands), the method ignores whether the latter might be used by some not yet known ligands, or whether they are intrinsically inappropriate for binding. It was shown [51] that simple prioritization of the consensus features seen in multiple complexes of a protein with different ligands is enough to improve the enrichment scores of pharmacophore-based virtual screening experiments – sophisticated machine learning is not necessarily a must. As usual, consensus modeling consolidates what is known but minimizes the chance of discovering radically new binding modes.

2.5 Progress in Automated Ligand Superposition and Ligand-Based Pharmacophore Elucidation

Since, in absence of an active site model, the putative ligand–site anchoring points remain unknown, ligand-based pharmacophore elucidation is typically based on some overlay model of active ligands, in order to evidence spatially conserved pharmacophore groups – and to assume (rightly or wrongly) that they are conserved because this is where the interaction with the active site occurs. Such alignment models are highly empirical and based on more than one shaky working hypothesis. In addition to the one cited above, lacking knowledge of the bioactive conformer forces ligand overlay users to assume that if two ligands possess a set of compatible geometries (that can be overlaid, pair-wise), then their bioactive conformers are somehow (!?) part of this set. Technically, the methods for searching (with respect to intra- and/or inter-molecular degrees of freedom, i.e., rigid/flexible overlays) and scoring the achieved overlay quality (typically some refined counting of the common features falling within a same pharmacophore feature sphere) vary widely, and – since based on empirical choices – none can be a priori considered superior. These approaches are rarely given a detailed description in literature, are often plagued by obscure empirical parameters set to some undocumented default values, and are often reinvented. Some of them are discussed in other reviews [14, 15, 17, 21, 23]. It is nevertheless important to point out that ligand-to-ligand fitting is intrinsically faster than ligand-to-site fitting (docking) – mainly because force field-based docking approaches include long-range interactions with protein atoms that are not necessarily in direct touch with the ligand. Using geometric and shape-matching techniques, authors [52] have actually managed to translate the docking problem into a ligand-to-ligand fit problem even in absence of an actual bound ligand, by rendering the site cavity as a virtual ligand. 
Latest contributions to the ligand overlay problem will be briefly discussed in the following.

Pharmacophore Field-Driven Superposition and 3D QSARs

The idea of using fuzzy pharmacophore “fields” [31, 53] of continuously decreasing intensity as a function of distance from their sources (typed atoms), instead of fixed-radius feature spheres that either contain or do not contain the atoms of matching pharmacophore types, has recently been revisited several times [54, 55, 56, 57]. The method assumes the optimal overlay to be the one maximizing the degree of overlap of corresponding pharmacophore fields (irrespective of whether they are called “fields” [53] or “Gaussian volumes” [57]) over the entire space surrounding the superimposed molecules. As such, it is less artifact-prone, for it does not demand any strong assumptions concerning the radii of classical pharmacophore feature spheres (instead, a fuzziness parameter may be used to smoothly control the tightness of match). Unsurprisingly, all the authors opted for the same empirical functional form of pharmacophore field intensity with respect to the distance to the source – a Gaussian, very useful for analytical calculation of field overlap integrals. Typing and implementation details differ (notably, a pharmacophore hypothesis [54] is used as an overlay template, rather than the structure of a singled-out active compound [53]) – however, authors have seized the opportunity to use the pharmacophore field maps corresponding to optimal overlays as descriptors for 3D-QSAR training (which, in this case, can be assimilated to a pharmacophore elucidation process – but beware of surprises [31]!).
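As a minimal sketch of why the Gaussian form is convenient: the overlap integral of two spherical Gaussians has a closed form, so a field overlap score reduces to a sum over feature pairs of the same type. The feature labels, the single shared width parameter `alpha` (playing the role of the fuzziness parameter), and the Tanimoto-like normalization are illustrative assumptions, not the typing or scoring scheme of any of the cited methods.

```python
import math

def gaussian_overlap(center_a, center_b, alpha=1.0, beta=1.0):
    """Closed-form integral over space of exp(-alpha|r-A|^2) * exp(-beta|r-B|^2)."""
    d2 = sum((p - q) ** 2 for p, q in zip(center_a, center_b))
    s = alpha + beta
    return (math.pi / s) ** 1.5 * math.exp(-alpha * beta / s * d2)

def field_overlap(mol_a, mol_b, alpha=1.0):
    """Sum of pairwise overlaps between features of the same pharmacophore type."""
    total = 0.0
    for type_a, xyz_a in mol_a:
        for type_b, xyz_b in mol_b:
            if type_a == type_b:
                total += gaussian_overlap(xyz_a, xyz_b, alpha, alpha)
    return total

def overlay_score(mol_a, mol_b, alpha=1.0):
    """Tanimoto-like normalization (an assumption): 1.0 for identical fields."""
    ab = field_overlap(mol_a, mol_b, alpha)
    aa = field_overlap(mol_a, mol_a, alpha)
    bb = field_overlap(mol_b, mol_b, alpha)
    return ab / (aa + bb - ab)

# Hypothetical two-feature molecules: (type, (x, y, z)) in Angstrom.
mol_a = [("HBD", (0.0, 0.0, 0.0)), ("HYD", (4.0, 0.0, 0.0))]
mol_b = [("HBD", (1.0, 0.0, 0.0)), ("HYD", (5.0, 0.0, 0.0))]  # mol_a shifted by 1 A

score_self = overlay_score(mol_a, mol_a)
score_shifted = overlay_score(mol_a, mol_b)
```

An overlay search would maximize such a score over the six rigid-body degrees of freedom (and, for flexible overlays, over conformers); a smaller `alpha` gives broader fields and a more tolerant match.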

Pharmacophore Model-Driven Overlay

At first sight, this option makes little sense because ligand overlay is a prerequisite to pharmacophore model building, rather than being piloted by the latter. However, recent work [58] showed that it is possible to “co-evolve” overlay model and pharmacophore model (or receptor model, as termed in that work). Starting from some random alignment of random ligand conformers of known binders, a first – likely meaningless – pharmacophore model is established, then ligands are realigned with respect to the latter and stochastic iterations are pursued, featuring random changes in ligand geometry, position, considered receptor model points, etc. This approach constructs a receptor model setting “points” at space positions presumably occupied by the active site atoms, instead of feature spheres. Model scoring is not based on an overall ligand-to-ligand overlay score but on the goodness-of-match between the ligands and the emerging pharmacophore model, with a penalty term for pharmacophore model complexity. Thus, the approach seeks for the minimalistic pharmacophore model able to simultaneously accommodate all the ligands while simultaneously discovering how the ligands must be aligned. As a consequence, if the training set includes ligands covering several different binding modes, the minimalistic pharmacophore model should cover all of these modes, with each ligand matching its own specific subset of hot spots. The authors note the high enrichment ratios achieved by the superposition method even in comparison with procedures that exploit the protein crystal structure. However, ligand flexibility leading to a combinatorial explosion of the problem space volume is a critical issue in this approach – as in any other attempt to build ligand-based pharmacophore models.

Feature Pairing-based Overlay Algorithms

In pharmacophore field-based methods, six (3 rotational + 3 translational) degrees of freedom per overlaid ligand need to be exhaustively explored in order to pinpoint the optimal relative alignment. This may be costly but does not require any a priori matchmaking between corresponding functional groups expected to be brought together during the overlay. Knowing beforehand which functional group of a ligand needs to be posed atop a given group of the reference compound deterministically fixes the roto-translation required to minimize the Root-Mean-Square (RMS) deviation between each reference group and its overlaid counterpart. Finding, in a molecule m, the functional group equivalent to a given feature in the reference compound M is, however, not an easy exercise for a medicinal chemist (molecular overlay techniques were actually created to help visualize the bioisosterism of apparently different groups), but it can be successfully automated, as shown in a recent study [59]. Features in m and M need to be characterized by descriptors of the pharmacophore pattern surrounding them – then, features from m are putatively associated with the counterparts in M that display a similar surrounding pharmacophore pattern. This matching is sometimes far from obvious. For example, if surrounding pharmacophore patterns are described only in terms of inter-feature distances, the algorithm ignores molecular chirality and may well suggest some impossible alignment. In the implementation discussed here, this is not a fatal flaw, for alignment based on a pruned, less stringent list of equivalent groups will be reattempted until some coherent pose is found. Near-optimal overlays can thus be effectively generated without needing to exhaustively explore a six-dimensional space.
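The pairing step can be illustrated with a small sketch. It is not the descriptor scheme of the cited study [59]: the distance-only environment descriptor, the greedy matching, and the tolerance `tol` are all assumptions chosen for brevity. The example deliberately uses two mirror-image four-point patterns to show the chirality blindness mentioned above: every feature finds a perfectly matching partner although no rigid rotation can superimpose the two patterns.

```python
import math

def descriptor(features, i):
    """Environment of feature i: (type, distance) of every other feature."""
    _, xyz_i = features[i]
    return sorted((t, math.dist(xyz_i, xyz))
                  for j, (t, xyz) in enumerate(features) if j != i)

def env_similarity(desc_a, desc_b, tol=0.5):
    """Greedy count of entries in desc_a that find an unused same-type
    partner in desc_b at a similar distance (within tol Angstrom)."""
    used, hits = set(), 0
    for t_a, d_a in desc_a:
        for k, (t_b, d_b) in enumerate(desc_b):
            if k not in used and t_a == t_b and abs(d_a - d_b) <= tol:
                used.add(k)
                hits += 1
                break
    return hits

def pair_features(mol, ref, tol=0.5):
    """For each feature of mol, propose the same-type feature of ref
    whose surrounding pattern is most similar."""
    pairs = []
    for i, (t_i, _) in enumerate(mol):
        best, best_score = None, 0
        for j, (t_j, _) in enumerate(ref):
            if t_i != t_j:
                continue
            score = env_similarity(descriptor(mol, i), descriptor(ref, j), tol)
            if score > best_score:
                best, best_score = j, score
        if best is not None:
            pairs.append((i, best, best_score))
    return pairs

# Hypothetical reference pattern and its mirror image (z -> -z).
ref = [("HBD", (0, 0, 0)), ("HBA", (3, 0, 0)), ("HYD", (0, 4, 0)), ("POS", (0, 0, 2))]
mol = [("HBD", (0, 0, 0)), ("HBA", (3, 0, 0)), ("HYD", (0, 4, 0)), ("POS", (0, 0, -2))]

pairs = pair_features(mol, ref)
```

Once a pairing is accepted, the roto-translation follows from a standard least-squares superposition of the paired points; a poor residual after that step is the signal to prune the pairing list and retry.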

However, this method is supposed to work best for compound pairs which do display a significant degree of pharmacophore feature similarity, thus containing enough unambiguously matching pharmacophore feature pairs. Pairs of marginally pharmacophorically similar compounds are better overlaid using the field approach. For example, consider a small molecule m and a much larger molecule M embedding a fragment m′, bioisosterically equivalent to m. The pharmacophore pattern descriptors of the features in m are, objectively, very different from those of the equivalent features in m′ (now surrounded by many more functional groups than there are in m). It is unlikely that they will outline m′ as equivalent group of m. Systematic rotations and translations, however, would eventually overlay m atop of m′ within M, and likely maximize local field covariance.

It is interesting to note that feature pairing-based alignment procedures can be, unlike field-based approaches, elegantly treated as a geometrical embedding problem [60] of a “hypermolecule.” This is defined by the ensemble of ligands to be overlaid, where classical geometrical constraints are considered within each ligand (bond lengths, nonbonded exclusions). Atoms of different ligands do not see each other (they are allowed to overlap – no inter-ligand nonbonded exclusions), and additional distance constraints are added to actually force equivalent pharmacophoric groups of different ligands to overlap. In fact, pharmacophore matching constraints can be refined in order to account both for spatial overlap of equivalent groups and for the preservation of the directionality of hydrogen bonding or aromatic stacking interactions. Stochastic proximity embedding, an approach previously used to conduct distance geometry-based conformational sampling, was successfully generalized to generate conformers for each ligand that overlap in terms of their equivalent groups. The elegance of the approach resides in the fact that intramolecular geometry and intermolecular alignment constraints are treated equivalently by the procedure. Unfortunately, the obtained geometries are merely guaranteed to be clash-free and to feature more or less correct bond lengths and valence angle values – intramolecular energy is not explicitly calculated in distance geometry approaches. Therefore, an energy-refinement postprocessing step is mandatory.

The Multiobjective Approach to Ligand Overlay

In flexible ligand overlay, the trade-off between goodness of fit (degree of overlap) and the acceptable strain energies of the overlaying conformers is a key empirical parameter, which is very difficult to set. How much strain energy is acceptable to increase, say, the field-based covariance score from 0.8 to 0.85? Alternatively, how much strain energy is acceptable to decrease the RMSD of the positions of equivalent functional groups by 0.5 Å? There is clearly no unambiguous way to weigh energy in kcal/mol against empirical overlay goodness scores (dimensionless correlation coefficients) or RMSD (Ångstrom). Strain energy and goodness of overlap are often conflicting objectives as energetically absurd geometry deformations may eventually allow any two compounds containing pharmacophorically equivalent groups to be perfectly overlaid. The total overlap volume is yet another independent overlay monitoring criterion – ensuring that the superimposed ligands are “squeezed” into a minimal volume. This may be mandatory for targets with very narrow active sites where suboptimal feature overlay and/or increased strain energies are the price to pay for fitting the site. Multiobjective optimization – the explicit dealing with multiple, potentially conflicting objectives to optimize, rather than forcefully selecting some empirically weighted linear combination thereof as the “ultimate” goodness criterion – is however a well-defined domain of numerical problem solving. In recent work [61], Pareto ranking methodologies were used to sample the space of possible partial overlays according to the three above-mentioned conflicting criteria of goodness-of-overlay, strain energy, and total overlay volume. 
The multiobjective framework leads to the identification of a family of plausible solutions, where each solution represents a different overlay involving different mappings between the molecules, and where the solutions taken together explore a range of different compromises in the objectives. The solutions are not ranked but are presented as equally valid compromises between the three objectives, according to the principles of Pareto dominance. The approach also takes into account the chemical diversity of the solutions, thus ensuring that they represent a diverse range of structure–activity hypotheses, which could be presented to a medicinal chemist for further consideration. It remains, however, unclear how exactly to use this series of reasonable hypotheses for virtual screening. A consensus approach, picking only candidates matching all of them, may appear too restrictive, since multiple hypotheses were generated precisely to underline the fact that, in diverse sets of binders, multiple anchoring patterns may coexist. The union of all features would, by contrast, generate far too complex a query, retrieving only partial matches when confronted with real drug-like compounds in databases – unfortunately, the methodology is still under development and has yet to prove the relevance of these partial matches.
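Pareto dominance itself is simple to state in code. The sketch below is a generic non-dominated filter, not the sampling algorithm of the cited work [61]; the overlay names and objective values are invented for illustration, and all three objectives are expressed so that lower is better (one minus the overlay score, strain energy in kcal/mol, overlay volume in cubic Ångstrom).

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(o["objectives"], s["objectives"])
                       for o in solutions if o is not s)]

# Hypothetical overlays: (1 - overlay score, strain energy, overlay volume).
overlays = [
    {"name": "A", "objectives": (0.20, 2.0, 300.0)},
    {"name": "B", "objectives": (0.15, 5.0, 310.0)},
    {"name": "C", "objectives": (0.25, 3.0, 320.0)},  # worse than A on all three
    {"name": "D", "objectives": (0.15, 6.0, 330.0)},  # no better than B anywhere
]

front = [s["name"] for s in pareto_front(overlays)]
```

Overlays A and B survive because neither is better on every criterion: B fits better, A is less strained and more compact. No weighting of kcal/mol against a dimensionless score was needed to reach that conclusion, which is the point of the approach.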

Let Machine Learning Find Out How to Best Pilot Ligand Overlay!

Based on a training set of almost 70,000 “reference” overlays of protein–ligand complexes – generated on the basis of conserved amino acid residues in the protein sequences – authors [62] have recently shown that machine learning may predict which overlay template to use and which of the available software tools to employ in order to maximize the chances of reproducing, by means of ligand–ligand overlay, the “reference” ligand–ligand alignment. Random Forest models, trained using standard measures of ligand and protein similarity and Lipinski-related descriptors, are used for automatically selecting the reference ligand and overlay method maximizing the probability of reproducing the reference overlay deduced from X-ray structures (an RMSD of 2 Å being the criterion for success). These model scores are highly predictive of overlay accuracy, and their use in template and method selection produces correct overlays in 57% of cases for 349 overlay ligands not used for training. The inclusion of protein sequence similarity in the models enables the use of templates bound to related protein structures, yielding useful results even for proteins having no available X-ray structures.

Alignment Rendered Simple: PhAST, the Linearized Pharmacophore Representation

Chemoinformaticians have always envied bioinformaticians, who deal with linear, diversity-restricted, and hence easy-to-align compounds at the amino acid/nucleotide sequence level. Now, the pharmacophore paradigm allows boiling down the vast diversity of organic functional groups to a limited set of fewer than ten standard pharmacophore types, allowing the simplified representation of organic molecules as 2D graphs colored by pharmacophore types, which can be thought of as the counterpart of the standard bioinformatics “alphabets” of 20 amino acids or four nucleotides. The next logical step consists in linearizing this colored molecular graph to obtain a canonical sequence of pharmacophore types. Such sequences may then be aligned and compared according to bioinformatics-inspired metrics, dealing with simple operations of gap insertions instead of costly 3D rotations and field overlap scoring. Certainly, compression of the 2D graph into a 1D sequence invariably triggers much loss of information. Nevertheless, the above-mentioned approach, recently developed and tested [63], compared favorably to other virtual screening approaches in a retrospective study and identified two novel inhibitors of 5-lipoxygenase product formation.
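Once molecules are reduced to strings, any textbook sequence alignment applies. The sketch below is a plain Needleman–Wunsch global alignment with a traceback; it is not the PhAST implementation. The one-letter alphabet (R aromatic, H hydrophobic, A acceptor, D donor) and the match, mismatch, and gap scores are assumptions made for illustration, and the canonical linearization of the molecular graph is taken as already done.

```python
def align(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Global alignment of two pharmacophore-type strings.
    Returns (score, aligned_a, aligned_b) with '-' marking gaps."""
    n, m = len(seq_a), len(seq_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Traceback from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

# Hypothetical linearized molecules.
result = align("RHAD", "RAD")
```

Here the shorter sequence aligns with a single gap opposite the hydrophobic feature, at a cost of one gap penalty: no rotation, conformer, or field integral is involved.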

2.6 Alignment-Free Ligand-Based Pharmacophore Elucidation

Ligand-based pharmacophore elucidation requires the detection of a consensus subset of features shared by all the actives that use a common ensemble of anchoring points to the active site. Alignment, per se, is not a goal but merely a (rather costly and parameterization artifact-prone) way to outline this consensus subset of conserved features, although it collaterally provides an intuitive depiction of the hypothesized binding mode. The classical form of alignment-free pharmacophore elucidation is QSAR-driven selection of 3D pharmacophore multiplets (pairs [64], triplets [24], quadruplets [65], etc.) that seem to correlate with activity throughout a training set. The considered variables (binary yes/no population toggles, or fuzzy population levels of the multiplets) depend only on the considered conformers – see discussions in §2.3 – but not on their orientations. Unless fuzzy logic is used, geometry artifacts are, however, a key problem of this approach, which may be extremely sensitive to the underlying conformational sampling procedure. Recently, authors [66] showed that “alignment-based alignment-free” approaches (using alignment-free multiplet descriptors, derived however from conformers that are able to smoothly align with geometries of other ligands) perform better – doubtless due to the beneficial filtering stage represented by the alignment step, which discards many nonsense geometries.
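The binary-toggle variant of a triplet descriptor, and the geometry artifact it suffers from, can be sketched as follows. The feature types, the 1 Å bin width, and the canonicalization by taking the smallest permuted key are assumptions for illustration, not the scheme of any cited fingerprint.

```python
import math
from itertools import combinations, permutations

def triplet_keys(features, bin_width=1.0):
    """Orientation-independent keys for all feature triplets of one conformer.
    Each key is (type1, type2, type3, bin(d12), bin(d13), bin(d23))."""
    keys = set()
    for trio in combinations(features, 3):
        candidates = []
        for (t1, p1), (t2, p2), (t3, p3) in permutations(trio):
            bins = tuple(int(math.dist(a, b) // bin_width)
                         for a, b in ((p1, p2), (p1, p3), (p2, p3)))
            candidates.append((t1, t2, t3) + bins)
        keys.add(min(candidates))  # canonical form, independent of input order
    return keys

def fingerprint(conformers, bin_width=1.0):
    """Binary population toggles: a triplet is 'on' if any conformer shows it."""
    fp = set()
    for conf in conformers:
        fp |= triplet_keys(conf, bin_width)
    return fp

# Two hypothetical conformers differing by 0.1 A in one coordinate.
conf1 = [("HBD", (0.0, 0.0, 0.0)), ("HBA", (3.05, 0.0, 0.0)), ("HYD", (0.0, 4.5, 0.0))]
conf2 = [("HBD", (0.0, 0.0, 0.0)), ("HBA", (2.95, 0.0, 0.0)), ("HYD", (0.0, 4.5, 0.0))]

keys1 = triplet_keys(conf1)
keys2 = triplet_keys(conf2)
```

The two conformers are practically identical, yet the donor–acceptor distance falls on either side of a bin border and the resulting keys differ. Fuzzy population levels, which spread each triplet over neighboring bins, are the usual remedy.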

QSAR-Driven Pharmacophore Elucidation?

Picking [67] relevant 3D pharmacophore triplets that correlate to the measured metabolic stability with respect to a given cytochrome (CYP 2D6) allows highlighting feature combinations that appear to facilitate binding to the enzyme in a way that aids substrate oxidation. However, it is highly unclear why pharmacophore signatures account for both the actual mechanistic aspect they were designed for – that is the ability of a substrate to be bound by the enzymatic site – and, in addition, predict whether the oxidation of the bound ligand will take place or not. True, no binding logically implies no metabolization – in this sense, filtering out the compounds that cannot possibly fit into the 2D6 active site is already significant. However, binding does not automatically imply metabolization – the article does not make a clear distinction between cytochrome inhibition and metabolic stability. QSAR, however, has long since been known to typically rely on correlations void of any causal background: the fact that metabolically unstable compounds tend to have a common signature in terms of 3D pharmacophore fingerprints does not yet imply that the respective triplets are mechanistically involved in the studied property. In this particular case, selected pharmacophore features seem to match reasonably well known anchoring points in the CYP 2D6 site (which is still, per se, insufficient to explain stability). Other studies [30, 31] have clearly highlighted that valid pharmacophore descriptor-based QSARs are more likely to evidence typical signatures of actives versus inactives. Such signatures may converge toward the set of required binding interactions only if care is taken to use a highly diverse training set. 
In this context [30, 68, 69], it is worth mentioning that 3D information is not needed to highlight different pharmacophore pattern signatures – therefore, 2D pharmacophore fingerprints [10] are as useful in QSARs as their 3D counterparts (if not more useful, since void of conformational sampling artifacts).

Graph-Theoretical Approaches

Clique detection is a graph-theoretical approach mining for frequent common subgraphs within a set of graphs. A recent paper [70] reports an adaptation of this algorithm to deal with ligand-based pharmacophore elucidation while describing the pharmacophore pattern of each conformer in each ligand as a doubly annotated fully connected graph. Vertices are “colored” by the represented pharmacophore type while edges linking each feature to all the others are labeled by the corresponding Euclidean distances in a conformer. A first implementation mines all frequent cliques that are present in at least one of the conformers of each (or a portion of all) molecules. The second algorithm exploits the similarities among the different conformers and achieves significant speedups. These algorithms are able to scale to data sets with arbitrarily large number of conformers per molecule and identify multiple ligand binding modes or multiple binding sites of the target. A related approach is used by the MedSumo-Lig [71] software to calculate the match score of the graphs of pharmacophore triangles found in ligands.
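For orientation, the classical pairwise form of clique-based pharmacophore matching is sketched below; the frequent-clique mining over many conformers and molecules described in [70] is considerably more elaborate. Nodes of the correspondence graph are same-type feature pairs, edges connect pairs whose inter-feature distances agree within a tolerance, and a basic Bron–Kerbosch recursion (without pivoting) finds the largest mutually compatible set. Feature types, coordinates, and the tolerance are illustrative assumptions.

```python
import math

def correspondence_graph(feats_a, feats_b, tol=0.5):
    """Nodes: (i, j) with feats_a[i] and feats_b[j] of the same type.
    Edges: two nodes are compatible if the corresponding distances agree."""
    nodes = [(i, j) for i, (ta, _) in enumerate(feats_a)
             for j, (tb, _) in enumerate(feats_b) if ta == tb]
    adj = {node: set() for node in nodes}
    for x, (i, j) in enumerate(nodes):
        for (k, l) in nodes[x + 1:]:
            if i == k or j == l:
                continue  # a feature cannot be matched twice
            da = math.dist(feats_a[i][1], feats_a[k][1])
            db = math.dist(feats_b[j][1], feats_b[l][1])
            if abs(da - db) <= tol:
                adj[(i, j)].add((k, l))
                adj[(k, l)].add((i, j))
    return adj

def max_clique(adj):
    """Largest clique via basic Bron-Kerbosch recursion."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)

# Hypothetical conformers; feats_b is a translated subset of feats_a.
feats_a = [("HBD", (0, 0, 0)), ("HBA", (3, 0, 0)), ("HYD", (0, 4, 0)), ("HBA", (9, 9, 9))]
feats_b = [("HYD", (1, 5, 1)), ("HBD", (1, 1, 1)), ("HBA", (4, 1, 1))]

common = max_clique(correspondence_graph(feats_a, feats_b))
```

The result lists which feature of the first conformer corresponds to which feature of the second. As with any distance-only scheme, the match is blind to chirality, so a final 3D superposition is still needed to confirm it.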

Artificial Intelligence in Pharmacophore Elucidation

Alternatively, Inductive Logic Programming (ILP), a class of machine-learning methods able to describe relational data directly, can be used to express pharmacophores as a logical inference based on predicates related to the nature and relative distances of the key features entering the pharmacophore. In a recent publication [72], putative pharmacophore “hot spots” (features) are assigned to the energy minima of various Molecular Interaction Fields, generated by letting the molecule, in its current conformation, interact with polar and hydrophobic probes, respectively. Based on a list of actives – and, optionally but highly desirably, of inactive examples – the program seeks inference rules like “A molecule is Active if (a) it possesses a hydrophobic interaction hot spot H1, AND (b) it possesses a positive charge interaction hot spot PC1, AND…, AND the distance between H1 and PC1 lies between given bounds, AND…”. Such rules may first be derived on the basis of a single active, then challenged to explain other actives: the higher the “coverage” – the generality – of the rule, the more trustworthy it can be considered. The method might actually deal with data sets of actives of different classes, which do not bind according to a common pharmacophore – and should successfully elucidate the characteristic pharmacophore of each class. Next, it has to be challenged with the prediction of inactives in order to discard unspecific rules predicting all the molecules to be active. The great advantage of the method is the human-interpretable definition of the elucidated pharmacophore. Disadvantages, however, concern the likelihood of extracting artifactual, locally applicable rules, which seem to hold for the training set due to the peculiar selection of included compounds but are not genuine “rules of nature,” as in classical QSAR [30].
However, the space of all possible inference rules being huge – perhaps larger than the problem space of a typical attribute selection-based regression problem – it is difficult to assess the likelihood of producing artifactual rules by inductive logic programming.

Even more recently [73], Inductive Logic Programming was used to generate pharmacophore-based sets of activity rules by means of known actives and inactives for several targets (this time using classical atom typing rules, not molecular interaction fields). However, these rules were not directly used as such to predict whether new structures are active or not but served to generate, for each molecule, a binary “rule compliance” fingerprint, in which bit number i is set for molecule M if M is predicted to be active according to rule i. Similarity screening was then reformulated as “Molecules with similar rule compliance fingerprints tend to have similar activities” and used to seek for neighbors of actives within a test dataset, containing hidden actives belonging to different chemical classes, in order to enforce “scaffold hopping.” In parallel, classical similarity-based screening using state of the art scaffold hopping fingerprints CATS [74] was performed. Only the rule compliance fingerprints appeared to perform better than random selection. However, the herein reported comparison of a machine-learning based technique to an unsupervised similarity search is not very informative. Rule compliance fingerprints encode valuable chemical information learned from examples of actives and inactives. CATS fingerprints do not – they rely on the most basic assumption that activity similarity can be related to overall similar pharmacophore patterns (with no distinction made between actually binding groups and “bystanders,” included to enhance physicochemical properties, or simply to make synthesis possible). Rule compliance fingerprint performance should rather be compared to a set of fingerprints based on multiple QSAR models, where locus i for molecule M contains the predicted activity of M according to, say, linear regression or Bayesian classifier model number i. 
Also, the number of hidden actives per target was, in this study, extremely small – more in-depth benchmarking is required to evidence the real advantages of this otherwise elegant and promising inductive logic programming-based approach.
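The “rule compliance” fingerprint idea can be made concrete with a toy sketch. The three distance-window rules, the molecule representation (a dictionary from feature-type pairs to one distance), and the use of the Tanimoto coefficient for comparison are all assumptions for illustration; the rules of the cited study [73] were learned by ILP, not written by hand.

```python
def within(pair, low, high):
    """Build a hypothetical rule: the molecule has this feature pair
    at a distance between low and high (Angstrom)."""
    def rule(molecule):
        return pair in molecule and low <= molecule[pair] <= high
    return rule

def rule_fingerprint(molecule, rules):
    """Bit i is set if the molecule is predicted active by rule i."""
    return [1 if rule(molecule) else 0 for rule in rules]

def tanimoto(fp_a, fp_b):
    """Shared set bits divided by bits set in either fingerprint."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 0.0

# Hypothetical rules and molecules.
rules = [
    within(("H", "P"), 4.0, 7.0),   # hydrophobe to positive charge
    within(("A", "D"), 2.0, 4.0),   # acceptor to donor
    within(("R", "P"), 5.0, 8.0),   # aromatic ring to positive charge
]
known_active = {("H", "P"): 5.5, ("A", "D"): 3.0}
candidate_1 = {("H", "P"): 6.0, ("A", "D"): 3.5, ("R", "P"): 9.0}
candidate_2 = {("R", "P"): 6.0}

fp_active = rule_fingerprint(known_active, rules)
sim_1 = tanimoto(fp_active, rule_fingerprint(candidate_1, rules))
sim_2 = tanimoto(fp_active, rule_fingerprint(candidate_2, rules))
```

The first candidate complies with the same rules as the known active and is retrieved as a neighbor whatever its scaffold, whereas the second satisfies only an unrelated rule. The fingerprint inherits whatever the rules learned, which is why comparing it to an unsupervised descriptor is not an even contest.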

Partitioning-Based Pharmacophore Recognition

An interesting, recently reported approach [75] aims to define the common pharmacophore of a set of binders by detecting k-point pharmacophore patterns (with k as large as possible) common to all the binders. In order to single out the potentially equivalent k-point pharmacophores, these are classified into cells based on their inter-feature distances. The k-point patterns that belong to a common cell (or to neighboring cells, to avoid binning artifacts) and are found in all candidate ligands are then considered for a final 3D overlay to assess the actual degree of spatial match (this is required, as the distance-driven binning scheme does not account for pattern chirality). A recursive distance partitioning algorithm is used to mine for reasonable pharmacophore classification schemes, leading to meaningful common pharmacophores.

2.7 Tuning of Pharmacophore-Based Virtual Screening Approaches: Efficiency Versus Performance

A recent study [76] on the impact of several screening parameters on the hit list quality of pharmacophore-based and shape-based virtual screening showed, intriguingly, that pushing the conformational sampling and the pharmacophore matching algorithm “too” far may prove detrimental to specificity – in the sense that careful searches for spurious poses of spurious conformations to match a given pharmacophore may eventually succeed. The authors recommend the use of CATALYST [77, 78] databases with a limit of at most 50 generated conformers per ensemble and the FAST generation algorithm combined with FAST database search as the default pharmacophore screening setup. Raising the number of considered conformers, or spending more time to match any given conformation to a pharmacophore, increases the retrieval rate of actives, though not specifically so: it becomes generally more likely to see some conformer of an inactive fitting, by chance, the pharmacophore models. Such observations are not new: better performance of 3D pharmacophore fingerprints built on the basis of fewer conformers had already been reported [79, 80] – then recently rediscovered [81] and deemed “surprising.” Similarly – at least in appearance – parsimony is recommended not only in terms of conformer set sizes, but also in terms of monitored characteristics in pharmacophore fingerprints, where bit reduction/silencing may actually enhance performance [82]. However, the latter effect is not related to conformational sampling artifacts but is due to intelligent focusing on the relevant fingerprint bits.

This is a fundamental problem, for it is tacitly assumed that the considered conformer-to-pharmacophore model overlays represent the absolute optimum of a goodness-of-match score over all the possible degrees of freedom (intra- and inter-molecular) of the problem. In reality, intramolecular degrees of freedom are “decoupled” from the pharmacophore match problem and treated separately – conformational sampling is performed “off line,” not considering the pharmacophore to be matched and thus not favoring pharmacophore-matching geometries. Even so, too thorough an enumeration of possible geometries is likely to return conformers matching “everything” in terms of pharmacophore patterns. Inasmuch as their relative stability cannot be determined rigorously, in order to estimate relative populations in solution at room temperature, the geometries that make physical sense are “hidden” among the many output by commercial software and cannot be singled out. Force field energy errors reach way beyond the characteristic kT = 0.6 kcal/mol – not to mention that, rigorously speaking, conformational free energies are the ones controlling conformer population levels. For all these reasons, the unreliable intramolecular strain energies provided by the sampling tools are usually ignored when performing pharmacophore matches – all the geometries found within the geometry database are seen as equally valid hypotheses, and the more of them there are, the more likely it is that one will eventually match the pharmacophore query.
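A short numerical illustration of why energy errors beyond kT are so damaging, using plain Boltzmann weights. The three conformer energies and the 2 kcal/mol error are invented for the example; the value of kT (about 0.593 kcal/mol near 298 K, rounded to 0.6 in the text) is the only physical input, and free energies are approximated by bare energies, which is itself one of the simplifications criticized above.

```python
import math

KT = 0.593  # kcal/mol near room temperature

def populations(energies, kt=KT):
    """Boltzmann populations from relative conformer energies (kcal/mol)."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical "true" strain energies of three conformers.
true_energies = [0.0, 1.0, 2.0]
# Same conformers as seen by a force field that misjudges the first by 2 kcal/mol.
ff_energies = [2.0, 1.0, 2.0]

pop_true = populations(true_energies)
pop_ff = populations(ff_energies)
```

With the true energies the first conformer holds roughly four fifths of the population; with the erroneous ones it drops to well under one fifth and the second conformer appears dominant. An error of this size is modest by force field standards, which is why energy-based pruning of conformer sets is so unreliable.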

The good news highlighted by this study – in agreement with other assessments of the likelihood of commercial sampling software to generate, among others, the bioactive conformer [83, 84, 85] – is that, at least as far as the typical co-crystallized ligands are concerned, some reasonable geometry is often found among the best conformers (more of the best conformers needing to be considered for more flexible ligands). Therefore, actives adopting bioactive geometries close to default solution structures are more likely to be discovered when adopting a reduced conformer set strategy, for inactives then stand fewer chances to contribute some spuriously matching geometry. The bad news, however, is the inability to define some rigorous cutoff for the maximal excess strain energy that still allows a conformer to putatively qualify as bioactive geometry, for force-field-based energy values are way too imprecise. Therefore, the empirical cutoff in terms of optimal numbers of considered conformers makes no physical sense at all (and is likely problem- and software-dependent). Whether or not the bioactive conformer is part of the considered set and whether or not inactives will produce spuriously matching geometries (which should have been discarded if their actual energy levels would have been known) are two highly serendipitous aspects of pharmacophore-based virtual screening.

2.8 Pharmacophore Match: An Applicability Domain Definition for Docking?

Flexible docking [86, 87, 88, 89] of putative ligands into an X-ray structure of the target site is expected to be the more powerful, comprehensive approach to the modeling of ligand–site interactions because it allows – in principle – the discovery of any putative binding mechanism. Pharmacophore matching may only outline whether a ligand matches a specified binding mode or not, whereas in docking the ligand has to “choose” the physically relevant binding mode out of all the possible ones, given an active site allowing for a certain number of contacts of specified nature. Reality is however different, as shown in a recent publication [90]. It may be either that the sampling of possible poses and ligand conformers is insufficient – and the relevant one never gets enumerated – or that the correct one is enumerated but not ranked as the most stable one due to force field/scoring function artifacts. If all poses for each compound are passed through different pharmacophores generated from co-crystallized complexes, significantly larger enrichment factors (at the same selected subset size) are obtained based on the top-scoring passing pose of each compound. This, however, does not mean that pharmacophore models are, per se, a more realistic or more complete description of the binding site. They perform better because they rely on additional experimental information – the ligand–site binding geometries seen in X-ray structures. Obviously, whenever a docking program generates a structure that is close to a known binding mode, it stands a fair chance of having discovered an active. Poses which, however, do not match known binding modes, but nevertheless score well, are highly likely to represent scoring artifacts.
Pharmacophore matching simply counts putative favorable contacts and would likely not have scored any better than docking if the list of considered contacts had included not only the ones actually seen in experimental complexes, but all contacts that might have been possible within the active site. The fact that the specific hydrophobic contacts, hydrogen bonds, or salt bridges seen in experimental structures seem to contribute much more to complex stability than other, theoretically at least as valid, hydrophobic contacts, hydrogen bonds, or salt bridges shows the intrinsic limitation of both the pharmacophore and the scoring function/force field typing methodologies, which are conceptually quite related. All pharmacophore contacts appear equal in the virtual screening methodology, but some (the “native” ones) are “more equal than the others.” The underlying reason is likely hidden within highly subtle flexibility- and solvent-induced enthalpic and entropic effects that cannot be modeled at the typical resolution scale of the pharmacophore paradigm. The entire, booming research field of docking interaction fingerprints [91, 92, 93, 94] came to life in order to allow a quick discrimination of native-like poses from exotic, likely artifactual ones. Both approaches, pharmacophore matching and fingerprint analysis, aim to restrict the Applicability Domain [95, 96, 97] of docking – which can be viewed as a complex, nonlinear Quantitative Structure–Activity Relationship [98, 99, 100] – to the neighborhood of the experimentally known realm. However, the price for safely high hit rates is the risk of discovering only “dull” analogs of known binding modes while filtering out the rare, but original ligands that actually do bind differently.
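
The pose-filtering idea discussed above can be sketched as follows. This is a toy illustration under stated assumptions, not the protocol of [90]: the `Pose` record, the `matches_native` flag (standing in for "the pose passes a co-crystal-derived pharmacophore"), the convention that lower docking scores are better, and all data are hypothetical. The enrichment factor uses the usual definition, the hit rate in the selected subset divided by the hit rate in the whole library.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    compound: str
    score: float          # docking score; lower is assumed to be better
    matches_native: bool  # hypothetical flag: pose passes a native-contact pharmacophore


def rank_compounds(poses, require_match):
    """Rank compounds by their best-scoring pose, optionally keeping only passing poses."""
    best = {}
    for p in poses:
        if require_match and not p.matches_native:
            continue
        if p.compound not in best or p.score < best[p.compound]:
            best[p.compound] = p.score
    return sorted(best, key=best.get)


def enrichment_factor(ranked, actives, library_size, subset_size):
    """Hit rate among the top `subset_size` slots divided by the library-wide hit rate."""
    hits = sum(1 for c in ranked[:subset_size] if c in actives)
    return (hits / subset_size) / (len(actives) / library_size)


# Toy library of four compounds, two of them active (invented data).
poses = [
    Pose("A", -7.0, True), Pose("A", -9.0, False),
    Pose("B", -10.0, False), Pose("B", -6.0, False),
    Pose("C", -8.0, True),
    Pose("D", -9.5, False), Pose("D", -5.0, True),
]
actives = {"A", "C"}

ef_plain = enrichment_factor(rank_compounds(poses, False), actives, 4, 2)
ef_filtered = enrichment_factor(rank_compounds(poses, True), actives, 4, 2)

if __name__ == "__main__":
    print("EF without filter:", ef_plain)
    print("EF with pharmacophore filter:", ef_filtered)
```

The data were constructed so that the well-scoring but non-native poses of the inactives dominate the unfiltered ranking; the example shows the mechanics of the comparison and is not evidence for the size of the effect.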

3 Recent Applications of Pharmacophore-Based Virtual Screening

Medicinal chemistry publications of recent years abound in applications of pharmacophore-based virtual screening approaches used to discover new bioactive compounds. As noted in the Introduction, this is now a mature chemoinformatics approach, intuitive and representing an apparently satisfactory trade-off between physico-chemical rigor and computational effort. Categorizing interactions into “hydrophobic,” “hydrogen bonds,” and “salt bridges” is apparently good enough to let us explain, in many cases, the mechanism of ligand–site binding.

Unfortunately, we have no reliable records of failed pharmacophore-based virtual screening attempts, since these never made it into print. Obviously, the reasons for these failures are unknown – they may range from the simple absence of active compounds matching the pharmacophore pattern (not a methodological problem) to the “drowning” of actual actives among too many false positives. At a technical level, they may be due to improper atom typing, unsatisfactory conformational sampling, biased machine learning/pharmacophore elucidation, or sheer inappropriateness of the pharmacophore paradigm in that context (very flexible protein targets, atypical interactions requiring quantum-mechanical modeling, etc.).

Another less than welcome peculiarity of the recent literature concerning pharmacophore models is the wealth of publications that apply existing virtual screening techniques – in most cases properly, and very often as a cascade of quick empirical filters followed by more rigorous docking approaches – but stop short of experimental validation of the selected hits [101, 102, 103, 104, 105, 106]. The reported work may well be commendable and valid – yet, what was it done for if there is no interest from experimentalist groups in assessing the activity of virtual hits and in continuing to develop them into useful bioactive compounds?

Also, it is important to point out that, in the experimental laboratory, the “pharmacophore” concept is sometimes rather loosely used to denote whatever structural feature is held responsible for the activity. It is not rare to hear the “benzodiazepine pharmacophore” mentioned in the sense of “benzodiazepine scaffold.” For example, Brizzi et al. [107] report “a novel pharmacophore consisting of both a rigid aromatic backbone and a flexible chain with the aim to develop a series of stable and potent ligands of cannabinoid receptors.”

The following is a brief and nonexhaustive review of the most recent successful virtual screening applications that were followed by experimental validation:

3.1 Recent Structure-Based Success Stories

Inhibitors of type 1 11β-hydroxysteroid dehydrogenase (11β-HSD1) were discovered [108] by a Catalyst [109]-based virtual screening, using an active site-derived pharmacophore model generated with LigandScout [110] on the basis of an enzyme–inhibitor complex X-ray structure. Pharmacophore-matching hits were further filtered by docking into the active site, using GLIDE [111]. Finally, 56 compounds were selected and submitted to biological testing. Eleven compounds with IC50 values below 10 μM were found, featuring three new chemical scaffolds as selective 11β-HSD1 inhibitors.

Scaffold hopping (see also §3.2) is mostly cited in conjunction with ligand-based design, although structure-based design starting from known site–ligand complex structures is equally amenable to the discovery of binders of new chemical classes. In fact, there is no sharp delimitation between the two strategies. The knowledge of the binding geometries from cocrystal X-ray structures may well be used to generate a ligand overlay model from the superposition of the entire complexes, thus leaving no room for doubt concerning the correctness of binding geometries and overlay mode. This ligand overlay may then serve for consensus pharmacophore extraction and its encoding in the form of molecular fingerprints, as classically seen in ligand-based approaches, followed by quick database screening. The virtual hits thus retrieved may then be revisited in the structure-based framework and docked into the active site. Such a strategy was recently applied [112] to the discovery of new PPAR-γ agonists, starting from the cocrystal structures of several natural compounds binding the receptor. Common pharmacophore features of the considered natural ligands were singled out after the overlay of the corresponding complexes, coded in the form of a LIQUID [113] pharmacophore, and used for database screening. Primary hits were docked into the PPAR active site, and two out of eight tested molecules (all chemically different from the initially considered natural products) were found to display significant activity in a cellular reporter gene assay.

3.2 Chemotype-Hopping Applications

Discovery of binders based on radically different chemical structures (chemotype-, scaffold-, or lead-hopping) is the main purpose of pharmacophore-based modeling, the more so knowing that a medicinal chemist’s brain may well understand scaffold-based similarity but fails to visualize spatial complementarity of equivalent functional groups. Therefore, the computer is, in this respect, a truly complementary tool to “chemical intuition.”

A classical Catalyst [109]-based pharmacophore screening for serotonin 2C receptor ligands, leading to the discovery of novel nanomolar binders without requiring any special tuning or novel computational tool development, was recently reported [114]. It shows that commercial software may provide, as far as pharmacophore searching goes, valid “turnkey” solutions for the medicinal chemistry lab.

A radical and successful example is the discovery of nonsaccharide organic (tetrahydroisoquinoline-based) molecules that mimic the activating effect of the extremely complex sugar heparin on the coagulation factor antithrombin (a plasma glycoprotein serine protease inhibitor), achieved [115] by means of simple, interactive design of compounds fitting a heparin-inspired pharmacophore.

The discovery [116] of a highly active Carbonic Anhydrase inhibitor, out of as few as six (scaffold-wise unrelated) selected virtual hits, was achieved by means of a combined strategy involving a cascade of successive ligand-based (ligand overlay-based pharmacophore screening), structure-based (docking into a homology model of the protein), and chemical intuition-driven selection and pharmacophore model building steps. The MOE [117] software suite was used.

One of the most radical examples of scaffold hopping is the design of dual-site inhibitors of macromolecules having a cofactor binding site not far from the actual substrate binding site. It may then be possible, using the pharmacophore model derived from the ternary protein–substrate–cofactor complex, to virtually screen for compounds simultaneously binding to both sites, which may be entropically beneficial. Of course, such inhibitors will be chemically different from either of the previously known substrate-competitive or cofactor-competitive inhibitors. A recent example [118] targeting Dihydrofolate Reductase (DHFR) of the anthrax bacillus used the Sybyl [119] software suite for pharmacophore screening followed by docking to discover micromolar, allegedly “bidentate” ligands blocking both the methotrexate and the NADPH cofactor site. Out of 15 selected molecules, two displayed low micromolar activity against DHFR. Neither is a derivative of traditional antifolates, an advantage being that these structurally and chemically distinct compounds possibly represent the first leads of two new classes of DHFR inhibitors. It is thus possible to combine key interaction points from different sites into a novel pharmacophore model – if these are reasonably close in space, and the resulting model is not overwhelmingly complex.

Pharmacophores can be successfully applied to model biological effects beyond simple ligand binding, such as, for example, protein–protein heterodimer disruption. Without knowledge of the exact interaction mechanisms, a ligand-based approach [120] using GALAHAD [121] made it possible to extract common pharmacophore features of a set of known disruptors of the c-Myc-Max dimer, and then to screen the ZINC database using Sybyl [119]. Nine compounds, none within the chemical class of the pharmacophore training molecules, were tested with a significant degree of success in both in vitro and cellular tests.

3.3 There is No Single Best Virtual Screening Approach: The Importance of Testing Alternative Methods and Consensus Scoring

An interesting study [122], leading to the in silico discovery of a potent Human Immunodeficiency Virus (HIV) entry blocker, a binder of the CXCR4 chemokine receptor, outlined the importance of benchmarking as many virtual screening tools as possible with respect to the considered target, in order to eventually pick hits among molecules consensually predicted to be active by the methods performing best in the tests. Considered methodologies included both ligand-based approaches (QSAR and pharmacophore modeling with MOE [117] and Discovery Studio [123], shape matching tools like PARAFIT [124], ROCS [125], and HEX) and structure-based approaches (docking with AUTODOCK [126], GOLD [127], FRED [128], and HEX [129]) based on a homology model of the receptor. The methods were fine-tuned in a retrospective virtual screening, and then successfully used in a prospective search.
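
The benchmark-then-consensus logic described above can be sketched in a few lines. This is a generic illustration, not the workflow of [122]: the method labels, the retrospective performance figures, the compound identifiers, and the selection rule (intersection of the top-ranked compounds of the best-performing methods) are all assumptions made for the example.

```python
def consensus_hits(rankings, benchmark, n_methods, top_n):
    """Pick compounds found in the top `top_n` of each of the `n_methods` best methods.

    rankings:  method name -> list of compound ids, best first (prospective screen)
    benchmark: method name -> retrospective performance, higher is better
    """
    best_methods = sorted(benchmark, key=benchmark.get, reverse=True)[:n_methods]
    top_sets = [set(rankings[m][:top_n]) for m in best_methods]
    common = set.intersection(*top_sets)
    return best_methods, sorted(common)


# Hypothetical retrospective performance of three methods on the target of interest.
benchmark = {"pharmacophore": 0.80, "shape": 0.75, "docking": 0.55}

# Hypothetical prospective rankings of five compounds by each method.
rankings = {
    "pharmacophore": ["c1", "c2", "c3", "c4", "c5"],
    "shape": ["c2", "c1", "c5", "c3", "c4"],
    "docking": ["c4", "c5", "c3", "c2", "c1"],
}

methods, hits = consensus_hits(rankings, benchmark, n_methods=2, top_n=3)

if __name__ == "__main__":
    print("methods kept:", methods)
    print("consensus hits:", hits)
```

A strict intersection is the most conservative choice; rank averaging or voting thresholds are equally plausible alternatives, and which one works best is itself target-dependent.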

New Hormone Sensitive Lipase inhibitors were discovered [130] by an original virtual screening approach using QSAR models combining pharmacophore hypothesis matching scores (determined with respect to a basis set of Catalyst [109]-generated pharmacophore hypotheses) and classical molecular descriptors. While hypothesis matching scores, per se, only loosely correlate with observed affinities, they appear to be useful molecular descriptors in QSAR equations. The same strategy was also used for discovery of novel Neuraminidase inhibitors [131].

3.4 Addressing Novel Binding Sites: Design of Noncompetitive Inhibitors

The NS5B protein, an RNA-dependent RNA polymerase and a key target for therapeutic intervention against the Hepatitis C virus, was typically blocked by means of designed nucleoside analogs or mimics binding at the nucleoside binding site. Nucleosides being major players of the cellular clockwork, close structural mimics are at high risk of being bound by other receptors and enzymes, leading to potentially serious side effects. A potential allosteric site of NS5B, distinct from the catalytic center, was targeted by means of structure-based design [132]. By virtual screening, the compound library was downsized from 3.5 million to 119 chemicals. The inhibitory activities of the selected compounds were tested in vitro and confirmed the discovery of low-potency, but interesting, noncompetitive NS5B inhibitors.

3.5 Multi-target Pharmacophore Modeling

The ability to choose which interaction features to include in, or leave out of, a considered pharmacophore model ensures that the same methodology may be successfully used either for designing specific inhibitors of a target, not hitting related macromolecules – by inclusion of site-specific features – or, conversely, for the design of promiscuous binders expected to hit a large panel of targets of the same biological class. In the latter case, the pharmacophore model should be based only on features that are conserved in all the targets of that class: G protein-coupled receptors (GPCRs), kinases, etc. This can be achieved by means of machine learning [133, 134] from a training set of compounds classified into “binders” and “non-binders” of a target family. How many representatives of the target class have to be “hit” by a molecule, and how strong the interaction has to be for the molecule to be labeled as a representative binder of the entire “class,” are matters of empirical choice – and so is the definition of the “target class” (all GPCRs? Rhodopsin-like GPCRs only? etc.). The design of class-specific rather than target-specific inhibitors may be a useful compound library design technique if the library is intended for multiple testing in various bioassays within that class – primary, and likely promiscuous, hits having to undergo a specificity-enhancement optimization phase in order to be rendered target-specific. Sometimes, the targeted in vivo activity does not require absolute specificity with respect to a given target, as in the case of Central Nervous System (CNS) drugs, known to hit multiple GPCRs in the brain, with a therapeutic effect (and, quite often, important side effects) emerging from the subtle interplay of all these interactions. Therefore, in the specific case of CNS activity, it does make sense to go beyond target class-directed library design, to in vivo effect-directed library design.
This [135] amounts to searching for a “pharmacophore” guaranteeing a strong affinity for various GPCRs plus an overall pharmacophore pattern with a balanced occurrence of polar and hydrophobic groups, in order to ensure the required pharmacokinetic properties of CNS drugs (passage of the blood–brain barrier, in particular).

If biological rationale suggests that simultaneous inhibition of two or more macromolecules is desirable for curing a given disease, and these targets are structurally related, in the sense of conserved key ligand anchoring points, then the design of specific multi-target drugs can be addressed by means of the largest common pharmacophore encoding the conserved anchoring points. Targeting few specific targets rather than an entire class of related proteins has the advantage of allowing individual pharmacophore extraction, comparison, and manual pruning to a subset of common key interactions. Docking calculations of pharmacophore matching compounds can be a valuable filter of primary virtual hits. A strategy based on the above outlined principles was successfully [136] applied to screen for dual-target inhibitors against both the human leukotriene A4 hydrolase (LTA4H-h) and the human nonpancreatic secretory phospholipase A2 (hnps-PLA2). Three compounds screened from the chemical database MDL Available Chemical Directory were found to inhibit these two enzymes at the 10 μM level.
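
The notion of a "largest common pharmacophore" can be illustrated with a deliberately simplified sketch. It assumes that the two site-derived models have already been brought into a common coordinate frame (for instance by superposing the binding sites), that a feature is a (type, position) pair, and that two features are equivalent when they share a type and lie within a distance tolerance. The 1.5 Å tolerance, the feature labels, and the coordinates are hypothetical, and the greedy first-match pairing is a shortcut rather than an optimal assignment.

```python
import math


def common_features(model_a, model_b, tolerance=1.5):
    """Greedily pair same-type features closer than `tolerance` (in Å).

    Each model is a list of (feature_type, (x, y, z)) tuples in a shared frame.
    Returns the common features, placed at the midpoint of each matched pair.
    """
    common = []
    used = set()  # indices of model_b features that are already paired
    for ftype_a, pos_a in model_a:
        for j, (ftype_b, pos_b) in enumerate(model_b):
            if j in used or ftype_a != ftype_b:
                continue
            if math.dist(pos_a, pos_b) <= tolerance:
                midpoint = tuple((a + b) / 2 for a, b in zip(pos_a, pos_b))
                common.append((ftype_a, midpoint))
                used.add(j)
                break
    return common


# Hypothetical anchoring points extracted from two related targets.
site_1 = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.0, 0.0)),
]
site_2 = [
    ("donor", (0.5, 0.0, 0.0)),
    ("acceptor", (6.0, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.5, 0.5)),
    ("cation", (1.0, 1.0, 1.0)),
]

shared = common_features(site_1, site_2)

if __name__ == "__main__":
    for ftype, position in shared:
        print(ftype, position)
```

In this toy case the donor and the hydrophobic feature are conserved, whereas the acceptors are too far apart to be merged; the pruned model would then serve as the dual-target query, with docking as a subsequent filter.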

3.6 Mechanistically Relevant Pharmacophores from Molecular Simulations

An original strategy [137] to block a parasite-specific metabolic pathway in Plasmodium falciparum by disrupting the bioactive homodimeric form of a specific kinase (CMK) relied on a pharmacophore built from the protein–protein (direct or water-mediated) contacts that are thought to be mechanistically relevant for dimerization, according to a molecular dynamics study. This pharmacophore model was used for classical database screening. Using an intensity-fading matrix-assisted laser desorption/ionization time-of-flight mass spectrometry approach, one of the virtual hits was found to interact with CMK. This approach suggests that the empirical pharmacophore search tool can be meaningfully used as a complement to rigorous molecular simulations, to provide a convenient, concise summary of the mechanistically relevant information extracted from them, and to use it for database screening – something that molecular dynamics per se is not able to do.

4 Conclusions

In light of the wealth of recent publications, both methodological and applicative, devoted to or based on pharmacophore modeling, it can be safely concluded that these techniques form, next to (sub)structure-based queries, the backbone of chemoinformatics in modern drug design. Reducing the complexity of noncovalent interactions to a set of rules describing the behavior of atom groups characterized by their pharmacophore types turned out to be a fruitful idea, leading to an approach that is both simple enough to be understood and accepted by bench chemists and realistic enough to allow for verifiable and successful predictions. This domain has doubtlessly reached maturity – a plethora of commercial and free software suites supports pharmacophore modeling in all its variants – topological or three-dimensional, structure- and ligand-based, overlay-based or overlay-free.

However, after browsing through the latest methodological papers in the field, it may be argued that there is still room for technological improvement – in terms of better conformational sampling, more rigorous pharmacophore typing schemes, faster flexible overlay techniques, intelligent selection of potential anchoring points in protein sites, etc. Yet, on second thought, it is not clear at all whether further methodological progress will significantly enhance the performance of pharmacophore-based virtual screening. Pharmacophore modeling is not a fundamental theory of matter and is, as such, intrinsically error-prone. Furthermore, as a series of successive modeling steps – say, in typical ligand-based procedures, conformational sampling, followed by pharmacophore typing, followed by molecular overlay, followed by hypothesis extraction and eventually hypothesis scoring – its overall performance cannot be better than the one supported by the weakest link of this chain. In this sense, specific work to improve one or the other of these aspects may not lead to overall benefits. More fine-grained conformational sampling may reduce the chance of missing the “bioactive” conformer, but may as well “drown” it in a large pool of irrelevant geometries. In ligand-based approaches, conformational sampling will be intrinsically flawed because of the impossibility of prioritizing the bioactive conformer over the others when its binding site geometry is not known. The hypothesis that flexible multiple overlays of known actives must necessarily lead to the discovery of bioactive conformers (the ones that stand out as the only geometries that are simultaneously compatible with a meaningful overlay) is not much more than wishful thinking when more than three rotatable bonds per ligand are involved.
On one hand, the volume of the problem space explodes exponentially as a function of the number of ligands; on the other, the empirical question of how much strain energy per ligand can be tolerated in order to improve the overlay quality score is unavoidable and does not admit any physically meaningful answer. Next, better conformational sampling may make strictly no difference if pharmacophore typing is not realistic. Yet, protonation state predictions – and, notably, pKa shifts induced by the actual binding to the receptor – are very difficult.
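
A back-of-the-envelope count makes the explosion of the overlay problem space tangible. The sketch below assumes, purely for illustration, three low-energy states per rotatable bond (a common rule of thumb, not a figure from this chapter) and treats every combination of one conformer per ligand as a candidate for the multiple overlay; real tools prune this space heavily, so the numbers are upper-bound estimates under these assumptions.

```python
from math import prod


def conformer_count(rotatable_bonds, states_per_bond=3):
    """Idealized number of conformers: states_per_bond ** rotatable_bonds."""
    return states_per_bond ** rotatable_bonds


def overlay_combinations(rotatable_bonds_per_ligand, states_per_bond=3):
    """Number of ways to pick one conformer per ligand for a joint overlay."""
    return prod(conformer_count(n, states_per_bond) for n in rotatable_bonds_per_ligand)


if __name__ == "__main__":
    # One ligand with 3, 4 or 8 rotatable bonds.
    for n in (3, 4, 8):
        print(n, "rotatable bonds:", conformer_count(n), "conformers")
    # Five hypothetical ligands with four rotatable bonds each.
    print("joint overlay combinations:", overlay_combinations([4] * 5))
```

Under these assumptions, a single ligand with four rotatable bonds has 81 idealized conformers, and a joint overlay of five such ligands already spans 81^5 = 3,486,784,401 conformer combinations, before any rigid-body alignment is even considered.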

The importance of the methodological novelties of recent years is therefore very difficult to assess. They may be intellectually sound and appealing, yet it is not at all sure that they will significantly improve hit rates in pharmacophore-based virtual screens. In a larger context, the absence of reports of pharmacophore-based screening failures renders the objective estimation of the robustness of various methods impossible. This is a fundamental problem of the entire field of computer-aided molecular simulations in drug design, for failures, even if reported, are difficult to interpret – were they imputable to the methodology, or were no actives found because the screened database did not contain any (at least not any that should have been found, within the applicability domain of the method)? It is not easy to apply Occam’s razor to “shave” the irrelevant refinement off the latest developments in pharmacophore modeling.

Although purely theoretical pharmacophore constructs and structure–activity relations are still being published per se, without any experimental follow-up (and without introducing any methodological novelties), encouragingly, the majority of the reported applications of pharmacophore modeling were sustained by actual experimental validation of the virtual hits. In the end, it is less important to know whether a method is slightly “better” than another, statistically speaking, than it is to know whether a method can provide experimentally valid responses. In this sense, the validation of out-of-the-box technological solutions by independent research groups – commercial software used as such, without any user-added improvements, was repeatedly shown to yield valid results – is excellent news. Unfortunately, failures not being reported, an objective cost/benefit analysis of the use of pharmacophore-based virtual screening in industry and academia cannot be undertaken. Furthermore, it should be kept in mind that any rational drug design undertaken instead of a random screening campaign will – likely – be much more cost-efficient but unlikely to lead to any paradigm-breaking discoveries in the field of interest. Scaffold hopping is as “revolutionary” as pharmacophore modeling can get: it may find new chemotypes matching new binding modes. As always, rational approaches are here to refine serendipitous discoveries.


  1. IUPAC. (2007) Glossary of Terms used in Medicinal Chemistry, (IUPAC, Ed.), IUPAC.
  2. Jorgensen, W. L. (1991) Rusting of the lock and key model for protein-ligand binding. Science 254, 954–955.
  3. Choudhury, N., Montgomery-Pettitt, B. (2007) The dewetting transition and the hydrophobic effect. Journal of the American Chemical Society 129, 4847–4852.
  4. Thomson Reuters. (2009) ISI Web of Knowledge, New York.
  5. Wang, C., Bradley, P., and Baker, D. (2007) Protein-protein docking with backbone flexibility. Journal of Molecular Biology 373, 503–519.
  6. De Grandis, V., Bizzarri, A. R., and Cannistraro, S. (2007) Docking study and free energy simulation of the complex between p53 DNA-binding domain and azurin. Journal of Molecular Recognition 20, 215–226.
  7. Guvench, O., and MacKerell, A. D., Jr. (2008) Comparison of protein force fields for molecular dynamics simulations. Methods in Molecular Biology, 63–88.
  8. Ponder, J. W., and Case, D. A. (2003) Force fields for protein simulations. Advances in Protein Chemistry 66, 27–85.
  9. Parent, B., Kökösy, A., and Horvath, D. (2007) Optimized evolutionary strategies in conformational sampling. Soft Computing 11, 63–79.
  10. Horvath, D. (2008) Topological Pharmacophores. in Chemoinformatics Approaches to Virtual Screening (Varnek, A., and Tropsha, A., Eds.), pp 44–72, RSC Publishing, Cambridge, UK.
  11. Bergmann, R., Linusson, A., and Zamora, I. (2007) SHOP: Scaffold HOPping by GRID-based similarity searches. Journal of Medicinal Chemistry 50, 2708–2717.
  12. Poulain, R., Horvath, D., Bonnet, B., Eckoff, C., Chapelain, B., Bodinier, M-C., and Deprez, B. (2001) From hit to lead. Combining two complementary methods for focused library design application to μ opiate ligands. Journal of Medicinal Chemistry 44, 3378–3390.
  13. Schlosser, J., and Rarey, M. (2009) Beyond the virtual screening paradigm: Structure-based searching for new lead compounds. Journal of Chemical Information and Modeling 49, 800–809.
  14. Koppen, H. (2009) Virtual screening – What does it give us? Current Opinion in Drug Discovery & Development 12, 397–407.
  15. Sun, H. M. (2008) Pharmacophore-based virtual screening. Current Medicinal Chemistry 15, 1018–1024.
  16. Sperandio, O., Miteva, M. A., and Villoutreix, B. O. (2008) Combining ligand- and structure-based methods in drug design projects. Current Computer-Aided Drug Design 4, 250–258.
  17. Prinz, H. (2008) How to identify a pharmacophore. Chemistry & Biology 15, 207–208.
  18. Muegge, I. (2008) Synergies of virtual screening approaches. Mini-Reviews in Medicinal Chemistry 8, 927–933.
  19. Mauser, H., and Guba, W. (2008) Recent developments in de novo design and scaffold hopping. Current Opinion in Drug Discovery & Development 11, 365–374.
  20. Green, D. V. S. (2008) Virtual screening of chemical libraries for drug discovery. Expert Opinion on Drug Discovery 3, 1011–1026.
  21. Douguet, D. (2008) Ligand-based approaches in virtual screening. Current Computer-Aided Drug Design 4, 180–190.
  22. Van Drie, J. H. (2007) Computer-aided drug design: The next 20 years. Journal of Computer-Aided Molecular Design 21, 591–601.
  23. McInnes, C. (2007) Virtual screening strategies in drug discovery. Current Opinion in Chemical Biology 11, 494–502.
  24. Mason, J. S., Good, A. C., and Martin, E. J. (2001) 3-D pharmacophores in drug discovery. Current Pharmaceutical Design 7, 567–597.
  25. Güner, O. F. (2000) Pharmacophore Perception, Use and Development in Drug Design, International University Line, La Jolla, CA.
  26. Orts, J., Grimm, S. K., Griesinger, C., Wendt, K. U., Bartoschek, S., and Carlomagno, T. (2008) Specific methyl group protonation for the measurement of pharmacophore-specific interligand NOE interactions. Chemistry A European Journal 14, 7517–7520.
  27. Bonachera, F., Parent, B., Barbosa, F., Froloff, N., and Horvath, D. (2006) Fuzzy tricentric pharmacophore fingerprints. 1 – Topological fuzzy pharmacophore triplets and adapted molecular similarity scoring schemes. Journal of Chemical Information and Modeling 46, 2457–2477.
  28. Guha, R., and Van Drie, J. H. (2008) Structure-activity landscape index: Identifying and quantifying activity cliffs. Journal of Chemical Information and Modeling 48, 646–658.
  29. Maggiora, G. M. (2006) On outliers and activity cliffs – Why QSAR often disappoints. Journal of Chemical Information and Modeling 46, 1535–1535.
  30. Bonachera, F., and Horvath, D. (2008) Fuzzy tricentric pharmacophore fingerprints. 2. Application of topological fuzzy pharmacophore triplets in quantitative structure-activity relationships. Journal of Chemical Information and Modeling 48, 409–425.
  31. Horvath, D., Mao, B., Gozalbes, R., Barbosa, F., and Rogalski, S. L. (2004) Strengths and Limitations of Pharmacophore-Based Virtual Screening. in Chemoinformatics in Drug Discovery. (Oprea, T. I., Ed.), pp 117–137, WILEY-VCH Verlag GmbH, Weinheim.
  32. von Korff, M., Freyss, J., and Sander, T. (2008) Flexophore, a new versatile 3D pharmacophore descriptor that considers molecular flexibility. Journal of Chemical Information and Modeling 48, 797–810.
  33. von Korff, M., Freyss, J., and Sander, T. (2008) Comparison of Ligand- and Structure-Based Virtual Screening on the DUD Data Set. in 8th International Conference on Chemical Structures, pp 209–231, Amer Chemical Soc, Noordwijkerhout, Netherlands.
  34. Cramer, R. D., Patterson, D. E., and Bunce, J. E. (1988) Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. Journal of the American Chemical Society 110, 5959–5967.
  35. Manallack, D. T. (2008) The use of local surface properties for molecular superimposition. Journal of Molecular Modeling 14, 797–805.
  36. Sperandio, O., Souaille, M., Delfaud, F., Miteva, M. A., and Villoutreix, B. O. (2009) MED-3DMC: A new tool to generate 3D conformation ensembles of small molecules with a Monte Carlo sampling of the conformational space. European Journal of Medicinal Chemistry 44, 1405–1409.
  37. Liu, X. F., Bai, F., Ouyang, S. S., Wang, X. C., Li, H. L., and Jiang, H. L. (2009) Cyndi: A multi-objective evolution algorithm based method for bioactive molecular conformational generation. BMC Bioinformatics 10, 14.
  38. Li, J., Ehlers, T., Sutter, J., Varma-O’Brien, S., and Kirchmair, J. (2007) CAESAR: A new conformer generation algorithm based on recursive buildup and local rotational symmetry consideration. Journal of Chemical Information and Modeling 47, 1923–1932.
  39. Takagi, T., Amano, M., and Tomimoto, M. (2009) Novel method for the evaluation of 3D conformation generators. Journal of Chemical Information and Modeling 49, 1377–1388.
  40. Perola, E., and Charifson, P. S. (2004) Conformational analysis of drug-like molecules bound to proteins: An extensive study of ligand reorganization upon binding. Journal of Medicinal Chemistry 47, 2499–2510.
  41. Böhm, H. J. (1992) The Computer Program LUDI: A new method for the de novo design of enzyme inhibitors. Journal of Computer-Aided Molecular Design 6, 61–78.
  42. Gillet, V., Johnson, A. P., Mata, P., Sike, S., and Williams, P. (1993) SPROUT: A program for structure generation. Journal of Computer-Aided Molecular Design 7, 127–153.
  43. Murray, C. W., Clark, D. E., Auton, T. R., Firth, M. A., Li, J., Sykes, R. A., Waszkowycz, B., Westhead, D. R., and Young, S. C. (1997) PRO_SELECT: Combining structure-based drug design and combinatorial chemistry for rapid lead discovery. 1. Technology. Journal of Computer-Aided Molecular Design 11, 193–207.
  44. Tintori, C., Corradi, V., Magnani, M., Manetti, F., and Botta, M. (2008) Targets looking for drugs: A multistep computational protocol for the development of structure-based pharmacophores and their applications for hit discovery. Journal of Chemical Information and Modeling 48, 2166–2179.
  45. Goodford, P. J. (1985) A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. Journal of Medicinal Chemistry 28, 849–857.
  46. 46.
    Bowman, A. L., Lerner, M. G., and Carlson, H. A. (2007) Protein flexibility and species specificity in structure-based drug discovery: Dihydrofolate reductase as a test system. Journal of the American Chemical Society 129, 3634–3640.PubMedCrossRefGoogle Scholar
  47. 47.
    Damm, K. L., and Carlson, H. A. (2007) Exploring experimental sources of multiple protein conformations in structure-based drug design. Journal of the American Chemical Society 129, 8225–8235.PubMedCrossRefGoogle Scholar
  48. 48.
    Jayachandran, G., Shirts, M. R., Park, S., and Pande, V. S. (2006) Parallelized over parts computation of absolute binding free energy with docking and molecular dynamics. The Journal of Chemical Physics 125, 84901–84905.CrossRefGoogle Scholar
  49. 49.
    Barillari, C., Marcou, G., and Rognan, D. (2008) Hot-spots-guided receptor-based pharmacophores (HS-Pharm): A knowledge-based approach to identify ligand-anchoring atoms in protein cavities and prioritize structure-based pharmacophores. Journal of Chemical Information and Modeling 48, 1396–1410.PubMedCrossRefGoogle Scholar
  50. 50.
    Martin, E. J., and Sullivan, D. C. (2008) Surrogate AutoShim: Predocking into a universal ensemble kinase receptor for three dimensional activity prediction, very quickly, without a crystal structure. Journal of Chemical Information and Modeling 48, 873–881.PubMedCrossRefGoogle Scholar
  51. 51.
    Zou, J., Xie, H. Z., Yang, S. Y., Chen, J. J., Ren, J. X., and Wei, Y. Q. (2008) Towards more accurate pharmacophore modeling: Multicomplex-based comprehensive pharmacophore map and most-frequent-feature pharmacophore model of CDK2. Journal of Molecular Graphics 27, 430–438.CrossRefGoogle Scholar
  52. 52.
    Ebalunode, J. O., Ouyang, Z., Liang, J., and Zheng, W. (2008) Novel approach to structure-based pharmacophore search using computational geometry and shape matching techniques. Journal of Chemical Information and Modeling 48, 889–901.PubMedCrossRefGoogle Scholar
  53. 53.
    Horvath, D. (2001) ComPharm – Automated comparative analysis of pharmacophoric patterns and derived QSAR approaches, novel tools in high throughput drug discovery. A proof of concept study applied to farnesyl protein transferase inhibitor design. in QSPR/QSAR Studies by Molecular Descriptors (Diudea, M. V., Ed.), pp 395–439., Nova Science Publishers, Inc, New York.Google Scholar
  54. 54.
    Landrum, G. A., Penzotti, J. E., and Putta, S. (2006) Feature-map vectors: A new class of informative descriptors for computational drug discovery. Journal of Computer-Aided Molecular Design 20, 751–762.PubMedCrossRefGoogle Scholar
  55. 55.
    Totrov, M. (2008) Atomic property fields: Generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chemical Biology & Drug Design 71, 15–27.CrossRefGoogle Scholar
  56. 56.
    Putta, S., Landrum, G. A., and Penzotti, J. E. (2005) Conformation mining: An algorithm for finding biologically relevant conformations. Journal of Medicinal Chemistry 48, 3313–3318.PubMedCrossRefGoogle Scholar
  57. 57.
    Taminau, J., Thijs, G., and De Winter, H. (2008) Pharao: Pharmacophore alignment and optimization. Journal of Molecular Graphics & Modelling 27, 161–169.CrossRefGoogle Scholar
  58. 58.
    Todorov, N. P., Alberts, I. L., de Esch, I. J. P., and Dean, P. M. (2007) QUASI: A novel method for simultaneous superposition of multiple flexible ligands and virtual screening using partial similarity. Journal of Chemical Information and Modeling 47, 1007–1020.PubMedCrossRefGoogle Scholar
  59. 59.
    Wolber, G., Dornhofer, A. A., and Langer, T. (2006) Efficient overlay of small organic molecules using 3D pharmacophores. Journal of Computer-Aided Molecular Design 20, 773–788.PubMedCrossRefGoogle Scholar
  60. 60.
    Bandyopadhyay, D., and Agrafiotis, D. K. (2008) A self-organizing algorithm for molecular alignment and pharmacophore development. Journal of Computational Chemistry 29, 965–982.PubMedCrossRefGoogle Scholar
  61. 61.
    Cottrell, S. J., Gillet, V. J., and Taylor, R. (2006) Incorporating partial matches within multiobjective pharmacophore identification. Journal of Computer-Aided Molecular Design 20, 735–749.PubMedCrossRefGoogle Scholar
  62. 62.
    Nandigam, R. K., Evans, D. A., Erickson, J. A., Kim, S., and Sutherland, J. J. (2008) Predicting the Accuracy of ligand overlay methods with random forest models. Journal of Chemical Information and Modeling 48, 2386–2394.PubMedCrossRefGoogle Scholar
  63. 63.
    Hähnke, V., Hofmann, B., Grgat, T., Proschak, E., Steinhilber, D., and Schneider, G. (2009) PhAST: Pharmacophore alignment search tool. Journal of Computational Chemistry 30, 761–771.PubMedCrossRefGoogle Scholar
  64. 64.
    Rafael Gozalbes, F. B., Nicolaï, E., Horvath, D., Froloff, N. (2009) Development and validation of a pharmacophore-based QSAR model for the prediction of CNS activity. ChemMedChem 4, 204–209.PubMedCrossRefGoogle Scholar
  65. 65.
    Mason, J. S., Morize, I., Menard, P. R., Cheney, D. L., Hulme, C., Labaudiniere, R. F. (1998) New 4-point pharmacophore method for molecular similarity and diversity applications: Overview of the method and applications, including a novel approach to the design of combinatorial libraries containing privileged substructures. Journal of Medicinal Chemistry 38, 144–150.Google Scholar
  66. 66.
    Shepphird, J. K., and Clark, R. D. (2006) A marriage made in torsional space: Using GALAHAD models to drive pharmacophore multiplet searches. Journal of Computer-Aided Molecular Design 20, 763–771.PubMedCrossRefGoogle Scholar
  67. 67.
    Sciabola, S., Morao, I., and de Groot, M. J. (2007) Pharmacophoric fingerprint method (TOPP) for 3D-QSAR modeling: Application to CYP2D6 metabolic stability. Journal of Chemical Information and Modeling 47, 76–84.PubMedCrossRefGoogle Scholar
  68. 68.
    Watson, P. (2008) Naive Bayes classification using 2D pharmacophore feature triplet vectors. Journal of Chemical Information and Modeling 48, 166–178.PubMedCrossRefGoogle Scholar
  69. 69.
    Askjaer, S., and Langgard, M. (2008) Combining pharmacophore fingerprints and PLS-discriminant analysis for virtual screening and SAR elucidation. Journal of Chemical Information and Modeling 48, 476–488.PubMedCrossRefGoogle Scholar
  70. 70.
    Podolyan, Y., and Karypis, G. (2009) Common pharmacophore identification using frequent clique detection algorithm. Journal of Chemical Information and Modeling 49, 13–21.PubMedCrossRefGoogle Scholar
  71. 71.
    Sperandio, O., Andrieu, O., Miteva, M. A., Vo, M. Q., Souaille, M., Delfaud, F., and Villoutreix, B. O. (2007) MED-SuMoLig: A new ligand-based screening tool for efficient scaffold hopping. Journal of Chemical Information and Modeling 47, 1097–1110.PubMedCrossRefGoogle Scholar
  72. 72.
    Buttingsrud, B., King, R. D., and Alsberg, B. K. (2007) An alignment-free methodology for modelling field-based 3D-structure activity relationships using inductive logic programming. Journal of Chemometrics 21, 509–519.CrossRefGoogle Scholar
  73. 73.
    Tsunoyama, K., Amini, A., Sternberg, M. J. E., and Muggleton, S. H. (2008) Scaffold hopping in drug discovery using inductive logic programming. Journal of Chemical Information and Modeling 48, 949–957.PubMedCrossRefGoogle Scholar
  74. 74.
    Schneider, G., Neidhart, W., Giller, T., and Schmid, G. (1999) “Scaffold-Hopping” by topological pharmacophore search: A contribution to virtual screening. Angewandte Chemie 38, 2894–2896.PubMedCrossRefGoogle Scholar
  75. 75.
    Zhu, F. Q., and Agrafiotis, D. K. (2007) Recursive distance partitioning algorithm for common pharmacophore identification. Journal of Chemical Information and Modeling 47, 1619–1625.PubMedCrossRefGoogle Scholar
  76. 76.
    Kirchmair, J., Ristic, S., Eder, K., Markt, P., Wolber, G., Laggner, C., and Langer, T. (2007) Fast and efficient in silico 3D screening: Toward maximum computational efficiency of pharmacophore-based and shape-based approaches. Journal of Chemical Information and Modeling 47, 2182–2196.PubMedCrossRefGoogle Scholar
  77. 77.
    Güner, O., Clement, O., and Kurogi, Y. (2004) Pharmacophore modeling and three dimensional database searching for drug design using catalyst: Recent advances. Current Medicinal Chemistry 11, 2991–3005.PubMedCrossRefGoogle Scholar
  78. 78.
    Kurogi, Y., and Güner, O. (2001) Pharmacophore modeling and threedimensional database searching for drug design using catalyst. Current Medicinal Chemistry 8, 1035–1055.PubMedCrossRefGoogle Scholar
  79. 79.
    Matter, H., and Pötter, T. (1999) Comparing 3D pharmacophore triplets and 2D fingerprints for selecting diverse compound subsets. Journal of Chemical Information and Modeling 39, 1211–1225.CrossRefGoogle Scholar
  80. 80.
    Horvath, D., and Jeandenans, C. (2003) Neighborhood behavior of in silico structural spaces with respect to in vitro activity spaces – A benchmark for neighborhood behavior assessment of different in silico similarity metrics. Journal of Chemical Information and Computer Sciences 43, 691–698.PubMedCrossRefGoogle Scholar
  81. 81.
    Fox, P. C., Wolohan, P. R. N., Abrahamian, E., and Clark, R. D. (2008) Parameterization and conformational sampling effects in pharmacophore multiplet searching. Journal of Chemical Information and Modeling 48, 2326–2334.PubMedCrossRefGoogle Scholar
  82. 82.
    Nisius, B., Vogt, M., and Bajorath, J. (2009) Development of a fingerprint reduction approach for Bayesian similarity searching based on Kullback-Leibler divergence analysis. Journal of Chemical Information and Modeling 49, 1347–1358.PubMedCrossRefGoogle Scholar
  83. 83.
    Kirchmair, J., Wolber, G., Laggner, C., and Langer, T. (2006) Comparative performance assessment of the conformational model generators omega and catalyst: A large-scale survey on the retrieval of protein-bound ligand conformations. Journal of Chemical Information and Modeling 46, 1848–1861.PubMedCrossRefGoogle Scholar
  84. 84.
    Kirchmair, J., Laggner, C., Wolber, G., and Langer, T. (2005) Comparative analysis of protein-bound ligand conformations with respect to catalyst’s conformational space subsampling algorithms. Journal of Chemical Information and Modeling 45, 422–430.PubMedCrossRefGoogle Scholar
  85. 85.
    Chen, I. J., and Foloppe, N. (2008) Conformational sampling of druglike molecules with MOE and catalyst: Implications for pharmacophore modeling and virtual screening. Journal of Chemical Information and Modeling 48, 1773–1791.PubMedCrossRefGoogle Scholar
  86. 86.
    Halperin, I., Ma, B., Wolfson, H., and Nussinov, R. (2002) Principles of docking: An overview of search algorithms and a guide to scoring functions. Proteins 47, 409–443.PubMedCrossRefGoogle Scholar
  87. 87.
    Jewsbury, P. J., Taylor, R. D., and Essex, J. W. (2002) A review of protein-small molecule docking methods. Journal of Computer-Aided Molecular Design 16, 151–166.PubMedCrossRefGoogle Scholar
  88. 88.
    Rarey, M., Claussen, H., Buning, C., and Lengauer, T. (2001) FlexE: Efficient molecular docking considering protein structure variations. Journal of Molecular Biology 308, 377–395.PubMedCrossRefGoogle Scholar
  89. 89.
    Todd, J., Ewing, A., and Kuntz, I. D. (1998) Critical evaluation of search algorithms for automated molecular docking and database screening. Journal of Computational Chemistry 18, 1175–1189.Google Scholar
  90. 90.
    Muthas, D., Sabnis, Y. A., Lundborg, M., and Karlen, A. (2008) Is it possible to increase hit rates in structure-based virtual screening by pharmacophore filtering? An investigation of the advantages and pitfalls of post-filtering. Journal of Molecular Graphics 26, 1237–1251.CrossRefGoogle Scholar
  91. 91.
    Brewerton, S. C. (2008) The use of protein-ligand interaction fingerprints in docking. Current Opinion in Drug Discovery & Development 11, 356–364.Google Scholar
  92. 92.
    Venhorst, J., Nunez, S., Terpstra, J. W., and Kruse, C. G. (2008) Assessment of scaffold hopping efficiency by use of molecular interaction fingerprints. Journal of Medicinal Chemistry 51, 3222–3229.PubMedCrossRefGoogle Scholar
  93. 93.
    Marcou, G., and Rognan, D. (2007) Optimizing fragment and scaffold docking by use of molecular interaction fingerprints. Journal of Chemical Information and Modeling 47, 195–207.PubMedCrossRefGoogle Scholar
  94. 94.
    Baroni, M., Cruciani, G., Sciabola, S., Perruccio, F., and Mason, J. S. (2007) A common reference framework for analyzing/comparing proteins and ligands. Fingerprints for ligands and proteins (FLAP): Theory and application. Journal of Chemical Information and Modeling 47, 279–294.PubMedCrossRefGoogle Scholar
  95. 95.
    Horvath, D., Marcou, G., and Varnek, A. (2009) Predicting the predictability: A unified approach to the applicability domain problem of QSAR models. Journal of Chemical Information and Modeling 49, 1762–1776.CrossRefGoogle Scholar
  96. 96.
    Tetko, I. V., Sushko, I., Pandey, A. K., Zhu, H., Tropsha, A., Papa, E., Oberg, T., Todeschini, R., Fourches, D., and Varnek, A. (2008) Critical assessment of QSAR models of environmental toxicity against Tetrahymena pyriformis: Focusing on applicability domain and overfitting by variable selection. Journal of Chemical Information and Modeling 48, 1733–1746.PubMedCrossRefGoogle Scholar
  97. 97.
    Eriksson, L., Jaworska, J., Worth, A. P., Cronin, M., McDowell, R. M., and Gramatica, P. (2003) Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. QSAR & Combinatorial Science 111, 1361–1375.Google Scholar
  98. 98.
    Ma, X. H., Jia, J., Zhu, F., Xue, Y., Li, Z. R., and Chen, Y. Z. (2009) Comparative analysis of machine learning methods in ligand-based virtual screening of large compound libraries. Combinatorial Chemistry & High Throughput Screening 12, 344–357.CrossRefGoogle Scholar
  99. 99.
    Klebe, G. (2008) Understanding QSAR: Do we always use the correct structural models to establish affinity correlation?Google Scholar
  100. 100.
    Gonzalez, M. P., Teran, C., Saiz-Urra, L., and Teijeira, M. (2008) Variable selection methods in QSAR: An overview. Current Topics in Medicinal Chemistry 8, 1606–1627.PubMedCrossRefGoogle Scholar
  101. 101.
    Nair, P. C., and Sobhia, M. E. (2008) Fingerprint directed scaffold hopping for identification of CCR2 antagonists. Journal of Chemical Information and Modeling 48, 1891–1902.PubMedCrossRefGoogle Scholar
  102. 102.
    Mascarenhas, N. M., and Ghoshal, N. (2008) An efficient tool for identifying inhibitors based on 3D-QSAR and docking using feature-shape pharmacophore of biologically active conformation – A case study with CDK2/CyclinA. European Journal of Medicinal Chemistry 43, 2807–2818.PubMedCrossRefGoogle Scholar
  103. 103.
    Vadivelan, S., Sinha, B. N., Tajne, S., and Jagarlapudi, S. (2009) Fragment and knowledge-based design of selective GSK-3 beta inhibitors using virtual screening models. European Journal of Medicinal Chemistry 44, 2361–2371.PubMedCrossRefGoogle Scholar
  104. 104.
    Dong, A. G., Huo, J. F., Gao, Q. Z., Zhao, K., and Wei, J. (2009) A three-dimensional pharmacophore model for RXR alpha agonists. Journal of Molecular Structure 920, 252–263.CrossRefGoogle Scholar
  105. 105.
    Xie, Q. Q., Xie, H. Z., Ren, J. X., Li, L. L., and Yang, S. Y. (2009) Pharmacophore modeling studies of type I and type II kinase inhibitors of Tie2. Journal of Molecular Graphics 27, 751–758.CrossRefGoogle Scholar
  106. 106.
    Andrade, C. H., Pasqualoto, K. F. M., Ferreira, E. I., and Hopfinger, A. J. (2009) Rational design and 3D-pharmacophore mapping of 5′-thiourea-substituted alpha-thymidine analogues as mycobacterial TMPK inhibitors. Journal of Chemical Information and Modeling 49, 1070–1078.PubMedCrossRefGoogle Scholar
  107. 107.
    Brizzi, A., Brizzi, V., Cascio, M. G., Corelli, F., Guida, F., Ligresti, A., Maione, S., Martinelli, A., Pasquini, S., Tuccinardi, T., and Di Marzo, V. (2009) New resorcinol-anandamide “hybrids” as potent cannabinoid receptor ligands endowed with antinociceptive activity in vivo. Journal of Medicinal Chemistry 52, 2506-2514.PubMedCrossRefGoogle Scholar
  108. 108.
    Yang, H. Y., Shen, Y., Chen, J. H., Jiang, Q. F., Leng, Y., and Shen, J. H. (2009) Structure-based virtual screening for identification of novel 11 beta-HSD1 inhibitors. European Journal of Medicinal Chemistry 44, 1167–1171.PubMedCrossRefGoogle Scholar
  109. 109.
    Accelrys Software, I. (2006) Catalyst, 4.9 ed., San Diego.Google Scholar
  110. 110.
    Wolber, G., and Langer, T. (2005) LigandScout: 3-D pharmacophores derived from protein-bound ligands and their use as virtual screening filters. Journal of Chemical Information and Modeling 45, 160–169.PubMedCrossRefGoogle Scholar
  111. 111.
    Schrödinger, L. (2005) Glide, New York.Google Scholar
  112. 112.
    Tanrikulu, Y., Rau, O., Schwarz, O., Proschak, E., Siems, K., Muller-Kuhrt, L., Schubert-Zsilavecz, M., and Schneider, G. (2009) Structure-based pharmacophore screening for natural-product-derived PPAR gamma agonists. Chembiochem 10, 75–78.PubMedCrossRefGoogle Scholar
  113. 113.
    Tanrikulu, Y., Nietert, M., Scheffer, U., Proschak, E., Grabowski, K., Schneider, P., Weidlich, M., Karas, M., Goebel, M., and Schneider, G. (2007) Scaffold hopping by “fuzzy” pharmacophores and its application to RNA targets. Chembiochem 8, 1932–1936.PubMedCrossRefGoogle Scholar
  114. 114.
    Ahmed, A., Choo, H., Cho, Y. S., Park, W. K., and Pae, A. N. (2009) Identification of novel serotonin 2C receptor ligands by sequential virtual screening. Bioorganic & Medicinal Chemistry 17, 4559–4568.CrossRefGoogle Scholar
  115. 115.
    Raghuraman, A., Liang, A. Y., Krishnasamy, C., Lauck, T., Gunnarsson, G. T., and Desai, U. R. (2009) On designing non-saccharide, allosteric activators of antithrombin. European Journal of Medicinal Chemistry 44, 2626–2631.PubMedCrossRefGoogle Scholar
  116. 116.
    Thiry, A., Ledecq, M., Cecchi, A., Frederick, R., Dogne, J. M., Supuran, C. T., Wouters, J., and Masereel, B. (2009) Ligand-based and structure-based virtual screening to identify carbonic anhydrase IX inhibitors. Bioorganic & Medicinal Chemistry 17, 553–557.CrossRefGoogle Scholar
  117. 117.
    (2005) MOE (Molecular Operating Environment), 2005.06 ed., Chemical Computing Group, Inc., Montreal.Google Scholar
  118. 118.
    Bennett, B. C., Wan, Q., Ahmad, M. F., Langan, P., and Dealwis, C. G. (2009) X-ray structure of the ternary MTX.NADPH complex of the anthrax dihydrofolate reductase: A pharmacophore for dual-site inhibitor design. Journal of Structural Biology 166, 162–171.PubMedCrossRefGoogle Scholar
  119. 119.
    Tripos, I. (2007) Sybyl, 8.0 ed., St. Louis, MO.Google Scholar
  120. 120.
    Mustata, G., Follis, A. V., Hammoudeh, D. I., Metallo, S. J., Wang, H. B., Prochownik, E. V., Lazo, J. S., and Bahar, I. (2009) Discovery of novel Myc-Max heterodimer disruptors with a three-dimensional pharmacophore model. Journal of Medicinal Chemistry 52, 1247–1250.PubMedCrossRefGoogle Scholar
  121. 121.
    Richmond, N. J., Abrams, C. A., Wolohan, P. R. N., Abrahamian, E., Willett, P., and Clark, R. D. (2006) GALAHAD: 1. Pharmacophore identification by hypermolecular alignment of ligands in 3D. Journal of Computer-Aided Molecular Design 20, 567–587.PubMedCrossRefGoogle Scholar
  122. 122.
    Perez-Nueno, V. I., Pettersson, S., Ritchie, D. W., Borrell, J. I., and Teixido, J. (2009) Discovery of novel HIV entry inhibitors for the CXCR4 receptor by prospective virtual screening. Journal of Chemical Information and Modeling 49, 810–823.PubMedCrossRefGoogle Scholar
  123. 123.
    Accelrys Software, I. (2007) Discovery Studio, 2.0 ed., San Diego, CA.Google Scholar
  124. 124.
    Lin, J., and Clark, T. (2005) An analytical, variable resolution, complete description of static molecules and their intermolecular binding properties. Journal of Chemical Information and Modeling 45, 1010–1016.PubMedCrossRefGoogle Scholar
  125. 125.
    Grant, A. J., and Pickup, B. T. (1996) A fast method of molecular shape comparison: A simple application of a Gaussian description of molecular shape. Journal of Computational Chemistry 17, 1653–1659.CrossRefGoogle Scholar
  126. 126.
    Morris, G. M. (2007) AutoDock.Google Scholar
  127. 127.
    Verdonk, M. L., Cole, J. C., Hartshorn, M. J., Murray, C. W., and Taylor, R. D. (2003) Improved protein-ligand docking using GOLD. Proteins 52, 609–623.PubMedCrossRefGoogle Scholar
  128. 128.
    McGann, M. R., Almond, H. R., Nicholls, A., Grant, J. A., and Brown, F. K. (2003) Gaussian docking functions. Biopolymers 68, 76–90.PubMedCrossRefGoogle Scholar
  129. 129.
    Ritchie, D. W., and Kemp, G. J. L. (2000) Protein docking using spherical polar Fourier correlations. Proteins 39, 178–194.PubMedCrossRefGoogle Scholar
  130. 130.
    Taha, M. O., Dahabiyeh, L. A., Bustanji, Y., Zalloum, H., and Saleh, S. (2008) Combining ligand-based pharmacophore modeling, quantitative structure-activity relationship analysis and in silico screening for the discovery of new potent hormone sensitive lipase inhibitors. Journal of Medicinal Chemistry 51, 6478–6494.PubMedCrossRefGoogle Scholar
  131. 131.
    Abu Hammad, A. M., and Taha, M. O. (2009) Pharmacophore modeling, quantitative structure-activity relationship analysis, and shape-complemented in silico screening allow access to novel influenza neuraminidase inhibitors. Journal of Chemical Information and Modeling 49, 978–996.PubMedCrossRefGoogle Scholar
  132. 132.
    Ryu, K., Kim, N. D., Choi, S. I., Han, C. K., Yoon, J. H., No, K. T., Kim, K. H., and Seong, B. L. (2009) Identification of novel inhibitors of HCV RNA-dependent RNA polymerase by pharmacophore-based virtual screening and in vitro evaluation. Bioorganic & Medicinal Chemistry 17, 2975–2982.CrossRefGoogle Scholar
  133. 133.
    Rolland, C., Gozalbes, R., Nicolai, E., Paugam, M. F., Coussy, L., Barbosa, F., Horvath, D., and Revah, F. (2005) G-protein-coupled receptor affinity prediction based on the use of a profiling dataset: QSAR design, synthesis, and experimental validation. Journal of Medicinal Chemistry 48, 6563–6574.PubMedCrossRefGoogle Scholar
  134. 134.
    Gozalbes, R., Rolland C., Nicolaï, E., Paugam M.-F., Coussy L., Horvath D., Barbosa F., Mao B., Revah F., and Froloff, N. (2005) QSAR strategy and experimental validation for the development of a GPCR focused library. QSAR & Combinatorial Science 24, 508–516.CrossRefGoogle Scholar
  135. 135.
    Gozalbes, R., Barbosa, F., Nicolai, E., Horvath, D., and Froloff, N. (2009) Development and validation of a pharmacophore-based QSAR model for the prediction of CNS activity. ChemMedChem 4, 204–209.PubMedCrossRefGoogle Scholar
  136. 136.
    Wei, D. G., Jiang, X. L., Zhou, L., Chen, J., Chen, Z., He, C., Yang, K., Liu, Y., Pei, J. F., and Lai, L. H. (2008) Discovery of multitarget inhibitors by combining molecular docking with common pharmacophore matching. Journal of Medicinal Chemistry 51, 7882–7888.PubMedCrossRefGoogle Scholar
  137. 137.
    Gimenez-Oya, V., Villacanas, O., Fernandez-Busquets, X., Rubio-Martinez, J., and Imperial, S. (2009) Mimicking direct protein-protein and solvent-mediated interactions in the CDP-methylerythritol kinase homodimer: A pharmacophore-directed virtual screening approach. Journal of Molecular Modeling 15, 997–1007.PubMedCrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Dragos Horvath (1)
  1. Laboratoire d’InfoChimie, UMR 7177, Université de Strasbourg – CNRS, Institut de Chimie, Strasbourg, France
