Abstract
Three-dimensional molecular alignment methods generally treat all pharmacophore features equally during superimposition. However, some pharmacophore features can be more important than others in a specific system. In this work, we took the overlap volumes of pharmacophore features, computed by a molecular alignment approach, as new molecular features for building machine learning models. These features can be assigned weights to indicate their importance. Validated on the DUD-E collection, models based on pharmacophore features represented by overlap volume performed well, with a median AUC of approximately 0.98 and a recall of almost 0.8.
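To make the idea of overlap-volume features concrete, the following is a minimal sketch and not the authors' implementation. It assumes each pharmacophore feature is modeled as a spherical Gaussian, uses the standard closed-form overlap integral of two Gaussians, and assumes the two molecules are already aligned. The feature type names, the default Gaussian parameters, and the function names are hypothetical choices made for illustration only.

```python
import math

# Hypothetical feature types; the abstract does not list the ones actually used.
FEATURE_TYPES = ("donor", "acceptor", "hydrophobe", "aromatic")


def gaussian_overlap(center_a, center_b, alpha_a=1.0, alpha_b=1.0, p_a=1.0, p_b=1.0):
    """Overlap integral of two spherical Gaussians p * exp(-alpha * |r - R|^2).

    The alpha and p defaults are placeholders, not values from the paper.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    s = alpha_a + alpha_b
    return p_a * p_b * (math.pi / s) ** 1.5 * math.exp(-alpha_a * alpha_b * d2 / s)


def overlap_features(query, candidate, weights=None):
    """Return one summed overlap volume per feature type.

    query and candidate are lists of (feature_type, (x, y, z)) tuples for two
    molecules that are assumed to be aligned already. weights optionally maps
    a feature type to a multiplier expressing its importance (default 1.0).
    """
    weights = weights or {}
    features = []
    for ftype in FEATURE_TYPES:
        total = sum(
            gaussian_overlap(q_center, c_center)
            for q_type, q_center in query if q_type == ftype
            for c_type, c_center in candidate if c_type == ftype
        )
        features.append(weights.get(ftype, 1.0) * total)
    return features


if __name__ == "__main__":
    query = [("donor", (0.0, 0.0, 0.0)), ("aromatic", (3.0, 0.0, 0.0))]
    candidate = [("donor", (0.5, 0.0, 0.0)), ("aromatic", (3.0, 1.0, 0.0))]
    print(overlap_features(query, candidate, weights={"donor": 2.0}))
```

The resulting vector, one value per feature type, is the kind of input a classifier could be trained on; only features of the same type contribute to each entry, so a weight on one type leaves the others unchanged.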
Funding
This research was funded by the Taishan Scholar Program of Shandong Province (tsqn201812159) and the Foundation of Clinical Pharmacy of Chinese Medical Association (LCYX-M008).
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
Additional information
Xiaojing Wang and Wenxiu Han are co-first authors.
About this article
Cite this article
Wang, X., Han, W., Yan, X. et al. Pharmacophore features for machine learning in pharmaceutical virtual screening. Mol Divers 24, 407–412 (2020). https://doi.org/10.1007/s11030-019-09961-4
DOI: https://doi.org/10.1007/s11030-019-09961-4