Effects of inductive bias on computational evaluations of ligand-based modeling and on drug discovery
Inductive bias is the set of assumptions that a person or procedure makes when predicting from data. Different methods for ligand-based predictive modeling have different inductive biases, with a particularly sharp contrast between 2D and 3D similarity methods. A distinctive aspect of ligand design is that the data available for testing methodology are largely man-made, and the design process that produced them itself involves prediction. By analyzing the molecular similarities of known drugs, we show that the historic drug discovery process carries a very strong 2D inductive bias. In studying the performance of ligand-based modeling methods, it is critical to account for this issue in dataset preparation, in the use of computational controls, and in the interpretation of results. We propose specific strategies that explicitly address the problems posed by inductive bias.
Keywords: Inductive bias; Ligand-based modeling; Computational evaluation; Molecular similarity; Surflex-Sim
The authors gratefully acknowledge NIH for partial funding of the work (grant GM070481). Drs. Jain and Cleves have a financial interest in BioPharmics LLC, a biotechnology company whose main focus is the development of methods for computational modeling in drug discovery. Tripos Inc. has exclusive commercial distribution rights for Surflex-Sim, licensed from BioPharmics LLC.