Journal of Computer-Aided Molecular Design, Volume 30, Issue 9, pp 761–771

A D3R prospective evaluation of machine learning for protein-ligand scoring

  • Jocelyn Sunseri
  • Matthew Ragoza
  • Jasmine Collins
  • David Ryan Koes


We assess the performance of several machine learning-based scoring methods at protein-ligand pose prediction, virtual screening, and binding affinity prediction. The methods and the manner in which they were trained make them sufficiently diverse to evaluate the utility of various strategies for training set curation and binding pose generation, but they share a novel approach to classification in the context of protein-ligand scoring. Rather than explicitly using structural data such as affinity values or information extracted from crystal binding poses for training, we instead exploit the abundance of data available from high-throughput screening to approach the problem as one of discriminating binders from non-binders. We evaluate the performance of our various scoring methods in the 2015 D3R Grand Challenge and find that although the merits of some features of our approach remain inconclusive, our scoring methods performed comparably to a state-of-the-art scoring function that was fit to binding affinity data.
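The idea of treating scoring as discrimination between binders and non-binders can be illustrated with a minimal sketch. This is not the authors' pipeline: the two-dimensional descriptors are synthetic, logistic regression is used only as a simple stand-in classifier, and the function names (`make_toy_set`, `train_logistic`, `predict_proba`, `roc_auc`) are hypothetical. The predicted probability of being a binder serves as the score, and ROC AUC measures how well the scores rank binders above non-binders.

```python
import math
import random


def make_toy_set(n_per_class=100, seed=0):
    """Synthetic stand-in for screening data: label 1 = binder, 0 = non-binder.

    Each compound gets two made-up descriptors; binders are shifted to +1,
    non-binders to -1. Real features would come from docked poses or ligands.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for label, mean in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            features.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)])
            labels.append(label)
    return features, labels


def predict_proba(weights, bias, x):
    """Logistic model: estimated probability that x is a binder."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=200, lr=0.1):
    """Fit weights by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def roc_auc(scores, labels):
    """Chance that a random binder outscores a random non-binder (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


features, labels = make_toy_set()
weights, bias = train_logistic(features, labels)
scores = [predict_proba(weights, bias, x) for x in features]
print("training-set ROC AUC:", round(roc_auc(scores, labels), 3))
```

The AUC here is computed on the training data for brevity; a real evaluation would hold out compounds (or whole targets) that the classifier never saw.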


Protein-ligand scoring · Machine learning · Virtual screening · D3R



We thank the organizers of D3R for their time and effort in running this invaluable exercise. We are also grateful to Nick Rego for his code for calculating SASA protein-ligand interaction terms.

Compliance with ethical standards


National Institute of General Medical Sciences [R01GM108340].

Supplementary material

Supplementary material 1: 10822_2016_9960_MOESM1_ESM.pdf (PDF, 1107 kB)



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, USA
