Abstract
Molecular descriptors encode a variety of molecular representations for computer-assisted drug discovery. Here, we focus on the Weighted Holistic Atom Localization and Entity Shape (WHALES) descriptors, which were originally designed for scaffold hopping from natural products to synthetic molecules. WHALES descriptors capture molecular shape and partial charges simultaneously. We introduce the key aspects of the WHALES concept and provide a step-by-step guide on how to use these descriptors for virtual compound screening and scaffold hopping. The presented results can be reproduced with the code freely available at github.com/ETHmodlab/scaffold_hopping_whales.
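As a rough illustration of what descriptor-based virtual screening involves, the sketch below ranks library compounds by their distance to a query in a scaled descriptor space. It does not compute WHALES descriptors and does not use the repository's code: the function names, the range scaling, the choice of Euclidean distance, and the descriptor values are all assumptions made for this example.

```python
import math


def scale_columns(rows):
    """Range-scale each descriptor column to [0, 1] so no single descriptor dominates."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def rank_by_similarity(query, library):
    """Return library compound names sorted by Euclidean distance to the query (closest first)."""
    names = list(library)
    # Assumption: query and library are scaled together; a real workflow
    # would typically reuse scaling parameters from a reference set.
    scaled = scale_columns([query] + [library[n] for n in names])
    q, rest = scaled[0], scaled[1:]
    dist = {n: math.dist(q, v) for n, v in zip(names, rest)}
    return sorted(names, key=dist.get)


# Placeholder descriptor vectors (made-up numbers, NOT real WHALES values).
query = [0.2, 1.5, -0.3]
library = {
    "cmpd_A": [0.25, 1.4, -0.2],
    "cmpd_B": [2.0, 0.1, 1.0],
    "cmpd_C": [0.1, 1.6, -0.4],
}
ranking = rank_by_similarity(query, library)
```

In a scaffold-hopping setting, the top-ranked compounds would then be inspected for scaffolds that differ from the query's; that step is not shown here.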
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Grisoni, F., Schneider, G. (2021). Molecular Scaffold Hopping via Holistic Molecular Representation. In: Ballante, F. (ed) Protein-Ligand Interactions and Drug Design. Methods in Molecular Biology, vol 2266. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1209-5_2
DOI: https://doi.org/10.1007/978-1-0716-1209-5_2
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1208-8
Online ISBN: 978-1-0716-1209-5
eBook Packages: Springer Protocols