Local Pre-processing for Node Classification in Networks

Application in Protein-Protein Interaction
  • Christopher E. Foley
  • Sana Al Azwari
  • Mark Dufton
  • Isla Ross
  • John N. Wilson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8060)


Network modelling provides an increasingly popular conceptualisation in a wide range of domains, including the analysis of protein structure. Typical analytical approaches model parameter values at individual nodes within the network. The spherical locality around a node provides a microenvironment that can be used to characterise an area of a network rather than a particular point within it. Microenvironments that centre on the nodes in a protein chain can be used to quantify parameters that are related to protein functionality. They also permit particular patterns of such parameters in node-centred microenvironments to be used to locate sites of particular interest. This paper evaluates an approach to index generation that seeks to construct microenvironment data rapidly. The results show that index generation performs best when the radius of microenvironments matches the granularity of the index. Results are presented to show that such microenvironments improve the utility of protein chain parameters in classifying the structural characteristics of nodes using both support vector machines and neural networks.
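As an illustration of the idea of a spherical microenvironment built from an index, the following is a minimal sketch that assumes a uniform grid (cell) index over 3D node coordinates. The abstract does not specify the index structure used in the paper, so the grid, the function names (`build_grid_index`, `microenvironment`, `microenvironment_mean`), the sample coordinates and the parameter values are all assumptions made for illustration only. With the cell size equal to the radius, each query inspects only the 27 cells around the centre node, which is one plausible reading of the radius matching the granularity of the index.

```python
from collections import defaultdict
from itertools import product
from math import ceil, dist, floor


def cell_of(point, cell_size):
    """Integer grid cell containing a 3D point (hypothetical helper)."""
    return tuple(floor(c / cell_size) for c in point)


def build_grid_index(points, cell_size):
    """Map each grid cell to the indices of the points that fall inside it."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[cell_of(p, cell_size)].append(i)
    return grid


def microenvironment(points, grid, cell_size, centre, radius):
    """Indices of all points within `radius` of points[centre], centre included."""
    p = points[centre]
    home = cell_of(p, cell_size)
    # Number of cells to scan in each direction so the whole sphere is covered.
    reach = ceil(radius / cell_size)
    members = []
    for offset in product(range(-reach, reach + 1), repeat=3):
        cell = tuple(h + o for h, o in zip(home, offset))
        for j in grid.get(cell, ()):
            if dist(p, points[j]) <= radius:
                members.append(j)
    return sorted(members)


def microenvironment_mean(values, members):
    """Average a per-node parameter over the members of a microenvironment."""
    return sum(values[j] for j in members) / len(members)


# Made-up coordinates and per-node parameter values, for illustration only.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
values = [1.0, 3.0, 5.0, 9.0]
radius = 2.0
cell_size = radius  # assumption: index granularity set equal to the radius

grid = build_grid_index(points, cell_size)
features = [
    microenvironment_mean(values, microenvironment(points, grid, cell_size, i, radius))
    for i in range(len(points))
]
print(features)
```

The resulting per-node averages could then serve as input features for a classifier such as a support vector machine or a neural network, in place of the raw per-node values.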





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Christopher E. Foley (1, 2)
  • Sana Al Azwari (1)
  • Mark Dufton (2)
  • Isla Ross (1)
  • John N. Wilson (1)
  1. Department of Computer & Information Sciences, University of Strathclyde, Glasgow, UK
  2. Department of Pure & Applied Chemistry, University of Strathclyde, Glasgow, UK
