
Fine tuning for success in structure-based virtual screening

Journal of Computer-Aided Molecular Design


Structure-based virtual screening (SBVS) plays a significant role in drug discovery. The method docks millions of compounds from corporate or public libraries in silico into a binding site of a disease-related protein structure, allowing a short list of potential ligands to be selected for experimental testing. Many algorithms are available for docking compounds and estimating their affinity for a targeted protein site. The performance of affinity estimation depends strongly on the size and nature of the site, so a rationale for selecting the best protocol is required. To address this issue, we have developed an automated calibration process, implemented in a KNIME workflow. It consists of four steps: preparation of a protein test set with structures and models of the target; preparation of a compound test set with target-related ligands and decoys; automated testing of 24 scoring/rescoring protocols for each target structure and model; and graphical display of the results. Automating the process and running it on high-performance computing resources greatly reduces the duration of the calibration phase, and testing many combinations of algorithms on various target conformations leads to a rational choice of the best protocol. Here, we present this tool and illustrate its application in setting up an optimal protocol for SBVS against Retinoid X Receptor alpha.
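The comparison at the heart of such a calibration can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the published KNIME workflow: it assumes each protocol returns one score per compound, that lower scores mean better predicted binding, and that protocols are compared by ROC area under the curve and by enrichment factor in the top-ranked fraction. The function names, protocol names, and toy scores are all hypothetical.

```python
from typing import Dict, List, Tuple

# One scored compound: (score, is_active). Assumption: lower score = better.
Scored = List[Tuple[float, bool]]


def roc_auc(scored: Scored) -> float:
    """ROC AUC as the probability that a random active is ranked ahead
    of a random decoy (ties count as half a win)."""
    actives = [s for s, is_active in scored if is_active]
    decoys = [s for s, is_active in scored if not is_active]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scored: Scored, fraction: float) -> float:
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    ranked = sorted(scored, key=lambda pair: pair[0])
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(1 for _, is_active in ranked[:n_top] if is_active)
    hits_all = sum(1 for _, is_active in ranked if is_active)
    if hits_all == 0:
        raise ValueError("need at least one active")
    return (hits_top / n_top) / (hits_all / len(ranked))


def rank_protocols(results: Dict[str, Scored], fraction: float = 0.1):
    """Return (protocol, AUC, EF) rows sorted by decreasing AUC."""
    rows = [(name, roc_auc(s), enrichment_factor(s, fraction))
            for name, s in results.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Toy data (hypothetical): 2 actives and 8 decoys per protocol.
toy_results: Dict[str, Scored] = {
    "protocol_a": [(-9.0, True), (-8.0, True)]
    + [(float(s), False) for s in (-7, -6, -5, -4, -3, -2, -1, 0)],
    "protocol_b": [(-3.0, True), (-1.0, True)]
    + [(float(s), False) for s in (-9, -8, -7, -6, -5, -4, -2, 0)],
}

for name, auc, ef in rank_protocols(toy_results, fraction=0.2):
    print(f"{name}: AUC={auc:.3f}, EF(20%)={ef:.2f}")
```

In a real calibration the same table would be built once per target structure or model, so that the protocol and the conformation are chosen together.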


Data availability

All data generated or analysed during this study are included in this published article and its supplementary information files.

Code availability

The KNIME workflow Evotec_StructureBasedVirtualScreening_Calibration generated during the current study is available on the KNIME Hub.



Abbreviations

AUC: Area under the curve
EF: Enrichment factor
FMO: Fragment molecular orbital
HBA: Hydrogen-bond acceptor
HBD: Hydrogen-bond donor
HM: Homology model
HTS: High throughput screening
MC: Monte Carlo
MD: Molecular dynamics
MW: Molecular weight
PDB: Protein Data Bank
RMSD: Root mean square deviation
ROC: Receiver operating characteristic
SBVS: Structure-based virtual screening
SF: Scoring function
VS: Virtual screening




This work was part of a Lean Initiative at Evotec. We thank Danielle De Boyer-Montegut for her support during all phases of this lean project, which aimed to ease the calibration of SBVS projects. We also warmly thank Jon Ainsley for revising the manuscript.


No funding was received to assist with the preparation of this manuscript.

Author information


Corresponding author

Correspondence to Emilie Pihan.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Pihan, E., Kotev, M., Rabal, O. et al. Fine tuning for success in structure-based virtual screening. J Comput Aided Mol Des 35, 1195–1206 (2021).
