Abstract
Four ligand-based virtual screening scenarios are studied: (1) prioritizing compounds for subsequent high-throughput screening (HTS); (2) selecting a predefined (small) number of potentially active compounds from a large chemical database; (3) assessing the probability that a given structure will exhibit a given activity; and (4) selecting the most active structure(s) for a biological assay. Each scenario is exemplified by retrospective ligand-based virtual screening for eight biological targets using two large databases, MDDR and WOMBAT. A comparison of the chemical spaces covered by these two databases is presented. The performance of two techniques for ligand-based virtual screening is investigated: similarity search with subsequent data fusion (SSDF) and novelty detection with Self-Organizing Maps (ndSOM). Three structure representations are compared: 2,048-dimensional Daylight fingerprints, topological autocorrelation weighted by atomic physicochemical properties (sigma electronegativity, polarizability, partial charge, and identity), and radial distribution functions weighted by the same atomic physicochemical properties. Both methods were found to be applicable in scenario one. The similarity search was found to perform slightly better in scenario two, whereas SOM novelty detection is preferred in scenario three. No method/descriptor combination achieved significant success in scenario four.
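To make the idea of similarity search with subsequent data fusion more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the Tanimoto coefficient as the similarity measure and the MAX rule for fusion (each database compound is scored by its best similarity to any known active); the abstract specifies neither. The function names, compound names, and toy fingerprints are hypothetical and are not taken from MDDR or WOMBAT.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is represented by the indices of its set bits (assumption).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_fusion_rank(
    queries: List[Fingerprint],
    database: Dict[str, Fingerprint],
) -> List[Tuple[str, float]]:
    """Score each database compound by its best similarity to any query
    (MAX fusion rule, assumed here) and sort from best to worst.
    Ties are broken alphabetically by compound name."""
    fused = {
        name: max(tanimoto(q, fp) for q in queries)
        for name, fp in database.items()
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Toy data: hypothetical fingerprints for illustration only.
known_actives = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
library = {
    "cmpd_A": frozenset({1, 2, 3, 5}),
    "cmpd_B": frozenset({10, 11, 12}),
    "cmpd_C": frozenset({20, 21}),
}
ranking = max_fusion_rank(known_actives, library)
for name, score in ranking:
    print(f"{name}\t{score:.2f}")
```

In a setting like scenario two, one would take the top of such a ranked list as the predefined small selection; other fusion rules (for example, averaging similarities or ranks) could be substituted in `max_fusion_rank` without changing the rest of the sketch.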
Hristozov, D.P., Oprea, T.I. & Gasteiger, J. Virtual screening applications: a study of ligand-based methods and different structure representations in four different scenarios. J Comput Aided Mol Des 21, 617–640 (2007). https://doi.org/10.1007/s10822-007-9145-8
DOI: https://doi.org/10.1007/s10822-007-9145-8