Improved Prediction of Blood–Brain Barrier Permeability Through Machine Learning with Combined Use of Molecular Property-Based Descriptors and Fingerprints
The blood–brain barrier (BBB) permeability of a compound determines whether the compound can effectively enter the brain. It is an essential property that must be accounted for in drug discovery whenever the target is in the brain. Several computational methods have been used to predict BBB permeability. In particular, the support vector machine (SVM), a kernel-based machine learning method, has been widely used in this field. For SVM training and prediction, compounds are characterized by molecular descriptors. Some SVM models have been based on molecular property-based descriptors (including 1D, 2D, and 3D descriptors) or on fragment-based descriptors (known as the fingerprints of a molecule). The selection of descriptors is critical for the performance of an SVM model. In this study, we aimed to develop a new, generally applicable SVM model that combines all of the features of the molecular property-based descriptors and fingerprints to improve the accuracy of BBB permeability prediction. The results indicate that our SVM model has improved accuracy compared to the currently available models for BBB permeability prediction.
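As a rough illustration of what combining the two descriptor families can look like, the sketch below concatenates scaled property-based descriptors with fingerprint bits into one feature vector per compound and evaluates a Gaussian (RBF) kernel between two compounds. It is not the authors' implementation: the abstract does not state the kernel, the scaling scheme, or the descriptor set, so the min-max scaling, the RBF kernel, the `gamma` value, the helper names, and the toy numbers are all assumptions chosen for illustration.

```python
import math


def min_max_scale(rows):
    """Scale each column of a list of equal-length rows to [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def combine_features(property_rows, fingerprint_rows):
    """Concatenate scaled property descriptors with fingerprint bits."""
    scaled = min_max_scale(property_rows)
    return [props + [float(b) for b in bits]
            for props, bits in zip(scaled, fingerprint_rows)]


def rbf_kernel(x, y, gamma=0.1):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical toy data (not from the paper): three compounds, each with
# two property descriptors and a 4-bit fingerprint.
properties = [[180.2, 1.2], [320.4, 3.5], [250.3, 2.1]]
fingerprints = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 1, 1]]

features = combine_features(properties, fingerprints)
similarity = rbf_kernel(features[0], features[2])
```

Scaling the continuous descriptors before concatenation keeps them on the same 0 to 1 range as the fingerprint bits, so neither family dominates the distance inside the kernel. In practice the combined vectors would be passed to an SVM library for training rather than to a hand-written kernel.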
KEY WORDS: blood–brain barrier permeability; molecular descriptor; fingerprint; physical property; modeling
The authors acknowledge the Computer Center at the University of Kentucky for supercomputing time on a Dell Supercomputer Cluster consisting of 388 nodes or 4816 processors.
This work was supported in part by the National Science Foundation (NSF grant CHE-1111761) and the National Institutes of Health (NIH grants UH2/UH3 DA041115, R01 DA035552, R01 DA032910, R01 DA013930, R01 DA025100, and UL1TR001998).
Compliance with Ethical Standards
Conflict of Interest
The authors declare that they have no conflict of interest.