A support vector machine approach to classify human cytochrome P450 3A4 inhibitors
The cytochrome P450 (CYP) enzyme superfamily plays a major role in the metabolism of commercially available drugs. Inhibition of these enzymes by one drug may raise the plasma level of another, leading to unwanted drug–drug interactions when two or more drugs are coadministered. Fast and reliable in silico methods that predict CYP inhibition from calculated molecular properties are therefore an important tool, applicable to both already synthesized and virtual compounds. We have studied the performance of support vector machines (SVMs) in classifying compounds according to their potency to inhibit CYP3A4. The data set for model generation consists of more than 1300 structurally diverse drug-like research molecules, which were divided into training and test sets. The predictive power of SVMs depends crucially on a careful selection of the parameters that specify the kernel function and the penalty for misclassifications. In this study we have investigated a procedure for identifying a valid set of SVM parameters, based on sampling the parameter space on a regular grid. From this set of parameters, either single SVMs or SVM committees were trained to distinguish between strong and weak inhibitors or to achieve a more realistic three-class assignment, with one class representing medium inhibitors. This workflow was studied for several kernel functions and descriptor sets. All SVM models performed significantly better than PLS-DA models generated from the corresponding descriptor sets. As a very promising result, simple two-dimensional (2D) descriptors yield a three-class model that correctly classifies more than 70% of the test set. Our work illustrates that SVMs used in combination with simple 2D descriptors provide a very effective and reliable tool for fast assessment of CYP3A4 inhibition potency in an early in silico filtering process.
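The parameter selection described in the abstract (sampling the parameter space on a regular grid, keeping a valid set of parameters, then training single SVMs or committees) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the grid is taken to be regular in log2 space over a penalty `C` and an RBF width `gamma`, `score` stands in for whatever validation measure is used (for example a cross-validated accuracy returned by an SVM library), `toy_score` is a made-up placeholder for it, and the committee is assumed to combine its members by simple majority vote.

```python
import itertools
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, float]  # (C, gamma); names are illustrative assumptions


def regular_log2_grid(lo: int, hi: int, step: int) -> List[float]:
    """Return 2**k for k = lo, lo+step, ... up to hi (regular in log2 space)."""
    return [2.0 ** k for k in range(lo, hi + 1, step)]


def select_valid_parameters(
    score: Callable[[float, float], float],
    c_values: Sequence[float],
    gamma_values: Sequence[float],
    min_score: float,
) -> List[Tuple[Params, float]]:
    """Score every (C, gamma) grid point and keep those with score >= min_score.

    The result is sorted best-first, so element 0 can be used for a single
    model and the first few elements for a committee.
    """
    scored = [((c, g), score(c, g))
              for c, g in itertools.product(c_values, gamma_values)]
    valid = [item for item in scored if item[1] >= min_score]
    return sorted(valid, key=lambda item: item[1], reverse=True)


def committee_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Majority vote per compound over the label lists of all committee members.

    Assumed combination rule; ties go to the label seen first among the members.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def toy_score(c: float, gamma: float) -> float:
    """Placeholder for a real validation score; peaks at C=2**3, gamma=2**-5."""
    return 1.0 - 0.1 * (abs(math.log2(c) - 3) + abs(math.log2(gamma) + 5))


if __name__ == "__main__":
    c_grid = regular_log2_grid(-1, 7, 2)
    gamma_grid = regular_log2_grid(-9, -1, 2)
    valid = select_valid_parameters(toy_score, c_grid, gamma_grid, min_score=0.75)
    print("valid parameter sets:", valid)

    # Hypothetical three-class predictions from three committee members.
    member_predictions = [
        ["strong", "weak", "medium"],
        ["strong", "medium", "medium"],
        ["weak", "weak", "medium"],
    ]
    print("committee labels:", committee_vote(member_predictions))
```

In practice `toy_score` would be replaced by a function that trains an SVM with the given parameters and returns its validation accuracy; the grid bounds and the acceptance threshold are choices that would need to be tuned to the data at hand.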
Keywords: ADME, cytochrome P450, in silico filter, molecular descriptor, QSAR, support vector machine
Abbreviations: ADME, absorption, distribution, metabolism and excretion; PLS, partial least squares; SVM, support vector machine(s); RBF, radial basis function.