Identification of Hotspots in Protein-Protein Interactions Based on Recursive Feature Elimination

  • Xiaoli Lin
  • Xiaolong Zhang
  • Fengli Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10954)


Studying protein-protein interactions and protein structure with computational methods is critical to understanding protein function. Hot spot residues play an important role in these interactions, yet conventional hot spot prediction methods still face considerable challenges. This paper proposes a hot spot prediction method that uses the SVM-RFE feature selection method to improve training performance. SMOTE-based oversampling is first used to add new samples and so avoid overfitting the classifier. SVM-RFE is then applied to obtain an optimal feature subset. Finally, an SVM built on the selected features predicts the hot spots. Experimental results indicate that hot spot prediction performance is significantly improved compared with previous methods.
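The three stages named in the abstract (oversampling, recursive feature elimination, final SVM) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy three-feature data, and all hyperparameters are hypothetical, and a simple linear SVM trained by subgradient descent stands in for whatever SVM solver the paper used. Features are ranked by the squared weight of a linear SVM, which is the usual SVM-RFE criterion.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Make n_new synthetic minority samples by interpolating between a
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - c) ** 2 for a, c in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from x to nb
        synthetic.append([a + gap * (c - a) for a, c in zip(x, nb)])
    return synthetic

def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=200, seed=0):
    """Linear SVM (hinge loss + L2 penalty) fitted by stochastic
    subgradient descent. Labels must be +1 or -1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def svm_rfe(X, y, n_keep):
    """Repeatedly train a linear SVM and drop the feature with the
    smallest squared weight until n_keep features remain."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda j: w[j] ** 2)
        remaining.pop(worst)
    return remaining

def predict(w, b, x):
    """Return +1 (hot spot) or -1 (non-hot spot) for one sample."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy data (hypothetical): feature 1 carries the class signal,
# features 0 and 2 are small noise.
rng = random.Random(42)

def make_sample(label):
    return [rng.uniform(-0.1, 0.1),
            label * rng.uniform(1.5, 2.5),
            rng.uniform(-0.1, 0.1)]

hot = [make_sample(+1) for _ in range(5)]        # minority class
non_hot = [make_sample(-1) for _ in range(15)]   # majority class

# Stage 1: balance the classes with synthetic minority samples.
hot_balanced = hot + smote(hot, n_new=len(non_hot) - len(hot))
X = hot_balanced + non_hot
y = [1] * len(hot_balanced) + [-1] * len(non_hot)

# Stage 2: select features. Stage 3: train the final SVM on them.
selected = svm_rfe(X, y, n_keep=1)
w, b = train_linear_svm([[row[j] for j in selected] for row in X], y)
print("selected feature indices:", selected)
```

In practice a library SVM and a tested SMOTE implementation would replace these hand-written pieces; the sketch only shows how the stages connect. In a real evaluation, oversampling should be applied to the training folds only so that synthetic samples do not leak into the test data.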


Protein-protein interactions · Hot spots · SVM-RFE · Classification



The authors thank the members of the Machine Learning and Artificial Intelligence Laboratory, School of Computer Science and Technology, Wuhan University of Science and Technology, for their helpful discussions in seminars. This work was supported in part by the National Natural Science Foundation of China (Nos. 61502356, 61273225) and by the Hubei Province Natural Science Foundation of China (No. 2018CFB526).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Hubei Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
  2. Information and Engineering Department of City College, Wuhan University of Science and Technology, Wuhan, China
