Abstract
Context
Protein–protein interactions (PPIs) underlie virtually all cellular processes. Whether in enzyme catalysis (the ‘classic’ type of protein function) or in signal transduction (‘non-classic’), proteins generally function through stable or quasi-stable multi-protein associations. The physical basis of such associations lies in the combined effect of the shape and electrostatic complementarities (Sc, EC) of the interacting partners at their interface, which provides indirect probabilistic estimates of the stability and affinity of the interaction. While Sc is a necessary criterion for inter-protein association, EC can be favorable as well as unfavorable (e.g., in transient interactions). Estimating equilibrium thermodynamic parameters (∆Gbinding, Kd) experimentally is costly and time-consuming, which opens a window for computational structural approaches. Attempts to estimate ∆Gbinding empirically from coarse-grained structural descriptors (primarily surface-area-based terms) have lately been overtaken by physics-based, knowledge-based, and hybrid approaches (MM/PBSA, FoldX, etc.) that compute ∆Gbinding directly, without intermediate structural descriptors.
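For reference, the two equilibrium parameters named above are related by the standard thermodynamic identity ∆Gbinding = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The following is a minimal Python sketch of that conversion. The function names, the default temperature of 298.15 K, and the choice of kcal/mol as the unit are illustrative assumptions and are not taken from EnCPdock.

```python
import math

# Gas constant in kcal/(mol*K); unit choice is an assumption for illustration.
R_KCAL = 1.987204e-3


def kd_to_dg(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy (kcal/mol) from Kd in mol/L (1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


def dg_to_kd(dg_kcal: float, temp_k: float = 298.15) -> float:
    """Inverse conversion: Kd in mol/L from a binding free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temp_k))


if __name__ == "__main__":
    # A nanomolar binder corresponds to roughly -12.3 kcal/mol at room temperature.
    print(round(kd_to_dg(1e-9), 2))
```

Tighter binding (smaller Kd) gives a more negative ∆Gbinding, which is why the two quantities can be used interchangeably as measures of affinity.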
Methods
Here, we present EnCPdock (https://www.scinetmol.in/EnCPdock/), a user-friendly web interface for the direct, conjoint, comparative analysis of complementarity and binding energetics in proteins. EnCPdock returns an AI-predicted ∆Gbinding computed by combining complementarity (Sc, EC) with other high-level structural descriptors (input feature vectors), with a prediction accuracy comparable to the state of the art. EnCPdock further locates a PPI complex by its {Sc, EC} values (taken as an ordered pair) in the two-dimensional complementarity plot (CP). It also generates mobile molecular graphics of the interfacial atomic contact network for further analysis, and furnishes individual feature trends along with relative probability estimates (Prfmax) of the obtained feature scores with respect to the events of their highest observed frequencies. Together, these functionalities are of practical use for structural tinkering and intervention, as might be relevant in the design of targeted protein interfaces. Combining all its features and applications, EnCPdock is a unique online tool that should benefit structural biologists and researchers in related fields.
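The abstract does not give the exact definition of Prfmax. One plausible reading is the frequency of the histogram bin containing an observed feature score, divided by the frequency of the most populated bin in a reference distribution. The Python sketch below illustrates that reading only. The function name pr_fmax, the equal-width binning, and the bin count are hypothetical choices, not EnCPdock's implementation.

```python
from typing import Sequence


def pr_fmax(value: float, reference: Sequence[float], n_bins: int = 10) -> float:
    """Relative probability of `value` w.r.t. the most frequent bin (assumed definition)."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference distribution has zero spread")
    width = (hi - lo) / n_bins

    # Build an equal-width histogram of the reference feature scores.
    counts = [0] * n_bins
    for x in reference:
        idx = min(int((x - lo) / width), n_bins - 1)  # clamp the top edge into the last bin
        counts[idx] += 1

    # Scores outside the observed range have no supporting frequency.
    if value < lo or value > hi:
        return 0.0
    idx = min(int((value - lo) / width), n_bins - 1)
    return counts[idx] / max(counts)


if __name__ == "__main__":
    # Toy reference scores (made up for illustration, not real Sc or EC data).
    ref = [0.0, 0.25, 0.25, 0.5, 0.5, 0.5, 0.75, 1.0]
    print(pr_fmax(0.6, ref, n_bins=4))
```

Under this reading, a value of 1 means the score falls in the most commonly observed range of the reference set, and values near 0 flag scores that are rarely observed.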
Data availability
Relevant tracking information for all entries in all datasets used can be found in the online Supplementary Material. Any additional data can be made available on request.
Notes
Dissociation constant.
(Gibbs) Free energy of binding.
Molecular Mechanics combined with Poisson–Boltzmann electrostatics and accessible surface area estimates.
Molecular Mechanics combined with Generalized Born electrostatics and accessible surface area estimates.
SD: standard deviations.
Receptor and ligand each consisting of a single polypeptide chain.
Acknowledgements
We convey our sincerest gratitude to Prof. Dhananjay Bhattacharyya (Saha Institute of Nuclear Physics, Kolkata, India, retired) for his time and thoughts on the matter in the course of one extremely helpful discussion during the revision.
Author information
Contributions
S.B. conceptualized the idea, designed the calculations, and wrote the main manuscript with help from G.B., N.D., and D.M. developed the web-server with assistance from S.B. N.D. and P.G. did the required literature survey. G.B. performed all training and validation, and actively participated in drafting the results and discussion. All authors participated in the revisions, and read and approved the final manuscript.
Ethics declarations
Ethics approval
Not applicable.
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Biswas, G., Mukherjee, D., Dutta, N. et al. EnCPdock: a web-interface for direct conjoint comparative analyses of complementarity and binding energetics in inter-protein associations. J Mol Model 29, 239 (2023). https://doi.org/10.1007/s00894-023-05626-0
DOI: https://doi.org/10.1007/s00894-023-05626-0