ROCS-derived features for virtual screening

Journal of Computer-Aided Molecular Design

Abstract

Rapid overlay of chemical structures (ROCS) is a standard tool for the calculation of 3D shape and chemical (“color”) similarity. ROCS uses unweighted sums to combine many aspects of similarity, yielding parameter-free models for virtual screening. In this report, we decompose the ROCS color force field into color components and color atom overlaps, novel color similarity features that can be weighted in a system-specific manner by machine learning algorithms. In cross-validation experiments, these additional features significantly improve virtual screening performance relative to standard ROCS.
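
The contrast drawn in the abstract, an unweighted sum of similarity terms versus per-feature weights learned for a specific system, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the feature layout (one hypothetical informative color component followed by three uninformative ones), the synthetic toy data, and the choice of logistic regression trained by plain gradient descent as the machine learning model.

```python
import math
import random


def unweighted_score(features):
    """Combine similarity features by a plain sum (parameter-free)."""
    return sum(features)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def weighted_score(weights, bias, features):
    """Combine similarity features with learned, system-specific weights."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression weights by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = weighted_score(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def make_toy_data(n=200, seed=0):
    """Synthetic data (hypothetical): feature 0 separates actives from
    inactives, features 1-3 are uniform noise unrelated to activity."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = rng.uniform(0.6, 1.0) if label else rng.uniform(0.0, 0.4)
        noise = [rng.uniform(0.0, 1.0) for _ in range(3)]
        X.append([informative] + noise)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    weights, bias = train_logistic(X, y)
    print("learned weights:", [round(w, 2) for w in weights])
```

In this toy setting the learned model places most of its weight on the one informative feature, whereas the unweighted sum treats the noise features as equally important. Real ROCS-derived features would come from overlays against query molecules rather than from a random number generator.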

Acknowledgments

We thank Paul Hawkins, Brian Cole, Anthony Nicholls, Brooke Husic, and Evan Feinberg for helpful discussion. We also acknowledge use of the Stanford BioX3 cluster supported by NIH S10 Shared Instrumentation Grant 1S10RR02664701. S.K. was supported by a Smith Stanford Graduate Fellowship. We also acknowledge support from NIH 5U19AI109662-02.

Author information

Corresponding author

Correspondence to Steven Kearnes.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 293 KB)

Cite this article

Kearnes, S., Pande, V. ROCS-derived features for virtual screening. J Comput Aided Mol Des 30, 609–617 (2016). https://doi.org/10.1007/s10822-016-9959-3

  • DOI: https://doi.org/10.1007/s10822-016-9959-3
