Effects of inductive bias on computational evaluations of ligand-based modeling and on drug discovery

  • Ann E. Cleves
  • Ajay N. Jain


Inductive bias is the set of assumptions that a person or procedure makes when predicting from data. Different methods for ligand-based predictive modeling have different inductive biases, with a particularly sharp contrast between 2D and 3D similarity methods. A unique aspect of ligand design is that the data available to test methodology are largely man-made, and that the design process that produced them itself involves prediction. By analyzing the molecular similarities of known drugs, we show that the historic drug discovery process has a very strong 2D inductive bias. In studying the performance of ligand-based modeling methods, it is critical to account for this issue in dataset preparation, in the use of computational controls, and in the interpretation of results. We propose specific strategies to explicitly address the problems posed by inductive bias.
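To make concrete what a 2D similarity comparison involves, the following is a minimal sketch of the Tanimoto coefficient computed over sets of topological features. It is not the method used in the paper: the feature names and the three feature sets are hypothetical placeholders, and a real analysis would derive fingerprints from actual molecular graphs with a cheminformatics toolkit.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets.

    Returns shared / (|a| + |b| - shared), or 0.0 when both sets are empty.
    """
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical 2D feature sets (illustrative labels, not real fingerprints).
fp_a = frozenset({"aromatic_ring", "sulfonamide", "piperazine", "carbonyl"})
fp_b = frozenset({"aromatic_ring", "sulfonamide", "piperazine", "ethyl"})
fp_c = frozenset({"indole", "carbonyl", "methylenedioxy"})

# fp_a and fp_b share three of five distinct features: 3 / 5 = 0.6
sim_ab = tanimoto(fp_a, fp_b)
# fp_a and fp_c share one of six distinct features: 1 / 6
sim_ac = tanimoto(fp_a, fp_c)

print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

Under a measure of this kind, two ligands built on the same scaffold score high while a ligand with a different scaffold scores low, regardless of how similar their 3D shapes are. A dataset assembled largely from close analogs would therefore tend to favor 2D methods in a retrospective evaluation.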


Inductive bias, Ligand-based modeling, Computational evaluation, Molecular similarity, Surflex-Sim



The authors gratefully acknowledge NIH for partial funding of the work (grant GM070481). Drs. Jain and Cleves have a financial interest in BioPharmics LLC, a biotechnology company whose main focus is the development of methods for computational modeling in drug discovery. Tripos Inc. has exclusive commercial distribution rights for Surflex-Sim, licensed from BioPharmics LLC.



Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  1. BioPharmics LLC, San Mateo, USA
  2. University of California, San Francisco, San Francisco, USA
