Are predefined decoy sets of ligand poses able to quantify scoring function accuracy?
Due to the large number of docking programs and scoring functions available, researchers face the problem of selecting the most suitable one when starting a structure-based drug discovery project. To guide this decision, several studies comparing different docking and scoring approaches have been published. When comparing scoring function performance, it is common practice to use a predefined, computer-generated set of ligand poses (decoys) and to re-evaluate their scores with each of the scoring functions under comparison. But are predefined decoy sets able to unambiguously evaluate and rank different scoring functions with respect to pose prediction performance? This question arose when the pose prediction performance of our piecewise linear potential-derived scoring functions (Korb et al. in J Chem Inf Model 49:84–96, 2009) was assessed on a standard decoy set (Cheng et al. in J Chem Inf Model 49:1079–1093, 2009). While these functions showed excellent pose identification performance when used to rescore the predefined decoy conformations, their performance degraded markedly when they were applied directly in docking calculations on the same test set. This implies that only rescoring performance can be evaluated on a discrete set of ligand poses. To compare pose prediction performance more rigorously, the search space of each scoring function has to be sampled extensively, as done in the docking calculations performed here. By analyzing performance on subsets of the complexes grouped by different properties of the active site, we were able to identify relative strengths and weaknesses of three scoring functions (ChemPLP, GoldScore, and Astex Statistical Potential). However, the reasons for the overall poor performance of all three functions on this test set, compared to other test sets of similar size, could not be identified.
Keywords: Docking, Ranking, Conformational space, Sampling, Active-site properties
The authors thank Renxiao Wang for providing the diverse test set of 195 protein–ligand complexes as well as Colin Groom and John Liebeschuetz for helpful discussions. The work was supported by the Konstanz Research School Chemical Biology (KoRS-CB), the Zukunftskolleg and the Young Scholar Fund of the Universität Konstanz. O.K. acknowledges support of the Landesgraduiertenförderung Baden-Württemberg and the Postdoc-Programme of the German Academic Exchange Service (DAAD). Additionally, we thank the Common Ulm Stuttgart Server (CUSS) and the Baden-Württemberg grid (bwGRiD), which is part of the D-Grid system, for providing the computer resources making the computations possible.