How to Benchmark Methods for Structure-Based Virtual Screening of Large Compound Libraries

  • Andrew J. Christofferson
  • Niu Huang
Part of the Methods in Molecular Biology book series (MIMB, volume 819)


Structure-based virtual screening is a useful computational technique for ligand discovery. To evaluate different docking approaches systematically, it is important to have a consistent benchmarking protocol that is both relevant and unbiased. Here, we describe the design of a benchmarking data set for assessing docking screens, a standard docking screening process, and the analysis and presentation of the enrichment of annotated ligands against a background decoy database.
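
As an illustration of the enrichment analysis mentioned above, the following is a minimal sketch, not the chapter's own protocol. It assumes that each docked compound carries a single score, that lower (more negative) scores rank better, and that annotated ligands are flagged with a boolean. The function names `enrichment_factor` and `roc_auc` and the example scores are hypothetical, chosen only for this sketch.

```python
def enrichment_factor(scores, is_ligand, top_fraction=0.01):
    """Enrichment factor at a given fraction of the ranked database.

    Assumption: lower docking scores are better, so the list is
    sorted in ascending order before taking the top fraction.
    """
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    n_total = len(ranked)
    n_ligands = sum(1 for _, flag in ranked if flag)
    if n_total == 0 or n_ligands == 0:
        raise ValueError("need at least one compound and one annotated ligand")
    # Number of compounds in the selected top slice (at least one).
    n_top = max(1, int(round(top_fraction * n_total)))
    hits_top = sum(1 for _, flag in ranked[:n_top] if flag)
    # Ligand rate in the top slice divided by ligand rate in the whole set.
    return (hits_top / n_top) / (n_ligands / n_total)


def roc_auc(scores, is_ligand):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen ligand scores better
    (lower) than a randomly chosen decoy; ties count as one half.
    """
    ligand_scores = [s for s, flag in zip(scores, is_ligand) if flag]
    decoy_scores = [s for s, flag in zip(scores, is_ligand) if not flag]
    if not ligand_scores or not decoy_scores:
        raise ValueError("need at least one ligand and one decoy")
    wins = 0.0
    for lig in ligand_scores:
        for dec in decoy_scores:
            if lig < dec:
                wins += 1.0
            elif lig == dec:
                wins += 0.5
    return wins / (len(ligand_scores) * len(decoy_scores))


if __name__ == "__main__":
    # Hypothetical scores: two ligands and two decoys.
    example_scores = [-4.0, -3.0, -2.0, -1.0]
    example_flags = [True, False, True, False]
    print(enrichment_factor(example_scores, example_flags, top_fraction=0.5))
    print(roc_auc(example_scores, example_flags))
```

The pairwise AUC is quadratic in the number of compounds, which is acceptable for a small illustration; for a large library a rank-based formulation would likely be preferable.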

Key words

Virtual screening, Molecular docking, Enrichment, Decoys



We acknowledge the Chinese Ministry of Science and Technology “863” Grant 2008AA022313 (to N.H.) for financial support and the Shoichet Lab at UCSF for the DOCK3.5.54 program.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. National Institute of Biological Sciences, Beijing, People's Republic of China
