Abstract
In cheminformatics, protein-ligand docking is a powerful tool for virtual screening, pose prediction, and binding affinity estimation. However, docking results depend on the quality of the crystal protein structures. Although several crystallographic indices can assist in selecting proteins, they are often misused by non-experts or have limited applicability. Here, we propose the B-factor index for the binding site (BFIbs), which describes the atomic fluctuations of the binding-site atoms relative to those of the atoms in the protein. Using an automated docking workflow, we performed docking experiments on 26,019 protein-ligand complexes. Docking performance was analyzed against five crystallographic quality indices: BFIbs, DPI (the diffraction-component precision index), DPIbs (the DPI of the binding-site atoms), RRfree (R–Rfree), and the resolution of the X-ray crystal structure. Only BFIbs correlated significantly with the root-mean-square deviation (RMSD) between the ligand poses in the crystal structures and the predicted docking poses. Most of the best docking results (RMSD < 2 Å) were obtained for structures with BFIbs < 1. We conclude that BFIbs, as a simple and interpretable parameter that complements other indices, can help to effectively prioritize protein structures for structure-based cheminformatics.
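The two quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula for BFIbs, so the code below assumes it is the ratio of the mean B-factor of the binding-site atoms to the mean B-factor of all protein atoms; this is an assumption made for illustration, not necessarily the authors' definition. The function names `bfi_bs` and `pose_rmsd` are hypothetical, and the RMSD is computed over matched atoms without superposition, which is the usual convention for comparing a docked pose with the crystal pose in the same receptor frame.

```python
import math
from statistics import mean


def bfi_bs(binding_site_b, protein_b):
    """Assumed BFIbs: mean binding-site B-factor / mean protein B-factor.

    The ratio form is an illustrative assumption; the abstract only says
    the index compares binding-site atoms with atoms in the protein.
    """
    if not binding_site_b or not protein_b:
        raise ValueError("both B-factor lists must be non-empty")
    return mean(binding_site_b) / mean(protein_b)


def pose_rmsd(crystal_xyz, docked_xyz):
    """RMSD (in the units of the input, e.g. Å) between matched atoms.

    Atoms must be listed in the same order in both poses; no
    superposition is applied.
    """
    if not crystal_xyz or len(crystal_xyz) != len(docked_xyz):
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(crystal_xyz, docked_xyz))
    return math.sqrt(squared / len(crystal_xyz))


# Toy values, invented for illustration only.
index = bfi_bs([12.0, 15.0, 18.0], [20.0, 25.0, 30.0, 25.0])
rmsd = pose_rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                 [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])

# Thresholds taken from the abstract: BFIbs < 1 and RMSD < 2 Å.
binding_site_is_ordered = index < 1.0
pose_is_accurate = rmsd < 2.0
```

Under this assumed definition, a value below 1 means the binding-site atoms have lower B-factors than the protein as a whole, which matches the abstract's observation that the best poses were obtained at BFIbs < 1.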
Availability of data and material
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The code used for data generation and/or analysis within the current study is available from the corresponding author on reasonable request.
Acknowledgements
The authors wish to thank Dr. Ramona Curpan for constructive comments and suggestions regarding the manuscript.
Funding
This work was supported by project number 1.1.4/2019/2020 of the “Coriolan Drăgulescu” Institute of Chemistry, Timișoara, Romanian Academy, Romania.
Author information
Contributions
Cristian Neanu and Sorin Avram contributed to the conception and design of this study. Liliana Halip and Cristian Neanu performed the molecular docking experiments, and Sorin Avram developed the decision tree. All authors contributed to writing the manuscript and approved the final version.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
ESM 1 (PDF 125 kb)
About this article
Cite this article
Halip, L., Avram, S. & Neanu, C. The B-factor index for the binding site (BFIbs) to prioritize crystal protein structures for docking. Struct Chem 32, 1693–1699 (2021). https://doi.org/10.1007/s11224-021-01751-9
DOI: https://doi.org/10.1007/s11224-021-01751-9