Feature Design for Protein Interface Hotspots Using KFC2 and Rosetta

Part of the book series: Association for Women in Mathematics Series (AWMS, volume 17)

Abstract

Protein–protein interactions regulate many essential biological processes and play an important role in health and disease. Experimentally characterizing the protein residues that contribute most to protein–protein interaction affinity and specificity is laborious. Models that accurately characterize hotspots at protein–protein interfaces therefore provide important information about how to inhibit therapeutically relevant protein–protein interactions. During the 2017 ICERM WiSDM workshop, we combined the KFC2a protein–protein interaction hotspot prediction features with Rosetta scoring function terms and interface filter metrics. Support vector machine classifiers were trained using two-way and three-way forward selection strategies, as well as a reverse feature elimination strategy. From these results, we identified subsets of combined KFC2a and Rosetta features that show improved performance over KFC2a features alone.
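The forward selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' code, which is available only by email. It assumes that "k-way" forward selection means scoring every k-feature combination of the remaining features in each round and keeping the best one; the abstract does not define the term. The function name `forward_select`, the synthetic data set, the RBF kernel, and the use of cross-validated accuracy as the criterion are all illustrative assumptions.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def forward_select(X, y, k=2, max_features=6, cv=5):
    """Greedy forward selection that adds up to k features per round.

    Assumption: "k-way" means trying every k-feature combination of the
    remaining features and keeping the best; the source does not define it.
    """
    selected, best_score = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        step = min(k, len(remaining), max_features - len(selected))
        round_best, round_combo = -np.inf, None
        for combo in combinations(remaining, step):
            cols = selected + list(combo)
            # Scale features, then fit an SVM; score by mean CV accuracy.
            model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
            score = cross_val_score(model, X[:, cols], y, cv=cv).mean()
            if score > round_best:
                round_best, round_combo = score, combo
        if round_best <= best_score:
            break  # no combination improves the cross-validated accuracy
        selected += list(round_combo)
        best_score = round_best
        remaining = [j for j in remaining if j not in round_combo]
    return selected, best_score


# Synthetic stand-in for the KFC2a + Rosetta feature table (hypothetical data).
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)
features, score = forward_select(X, y, k=2)
print(features, round(score, 3))
```

Calling `forward_select(X, y, k=3)` gives the three-way variant under the same assumption. A backward elimination pass would start from all features and drop the least useful ones instead.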

Acknowledgements

The feature table and feature selection code are available by email to the corresponding author. We thank the Association for Women in Mathematics (AWM) and the Brown University Institute for Computational and Experimental Research in Mathematics (ICERM) for hosting the Women in Data Science and Mathematics (WiSDM) workshop. The Brown University Center for Computation and Visualization (CCV) and the Institute for Protein Design at the University of Washington provided computational resources used for this project. Participation by JM was sponsored by the National Science Foundation [NSF DMS 1160360]. The AWM Advance Program supported participation by FS, AL, YC, TW, and HC. Participation by TW was also supported by DIMACS. FS is generously funded by the Washington Research Foundation Institute for Protein Design Postdoctoral Innovation Fellowship.

Author information

Corresponding author

Correspondence to Julie C. Mitchell.

Copyright information

© 2019 The Author(s) and the Association for Women in Mathematics

About this chapter

Cite this chapter

Seeger, F., Little, A., Chen, Y., Woolf, T., Cheng, H., Mitchell, J.C. (2019). Feature Design for Protein Interface Hotspots Using KFC2 and Rosetta. In: Gasparovic, E., Domeniconi, C. (eds) Research in Data Science. Association for Women in Mathematics Series, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-11566-1_8
