Graph Spectral Approach for Identifying Protein Domains

  • Hari Krishna Yalamanchili
  • Nita Parekh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5462)


We present a simple method based on graph spectral properties to automatically partition multi-domain proteins into individual domains. The identification of structural domains rests on the assumption that interactions between amino acids are more extensive within a domain than across domains. These interactions, and the topological details of protein structures, can be captured effectively by the protein contact graph, constructed by treating each amino acid as a node and drawing an edge between two nodes if the Cα atoms of the corresponding amino acids are within 7 Å of each other. We show that Newman’s community detection approach, developed for social networks, can be used to identify domains in protein structures. We have implemented this approach on protein contact networks and analyze the eigenvector corresponding to the largest eigenvalue of the modularity matrix, a modified form of the adjacency matrix, using a quality function called “modularity” to identify optimal divisions of the network into domains. The proposed approach works even when domains are formed by amino acids that do not occur sequentially along the polypeptide chain, and it requires no a priori information regarding the number of nodes.
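The pipeline described in the abstract (contact graph from Cα coordinates, modularity matrix, sign pattern of the leading eigenvector) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `contact_matrix`, `modularity_matrix` and `leading_eigenvector_split` are hypothetical, NumPy is an assumed dependency, "within 7 Å" is read as an inclusive cutoff, and only a single two-way division is shown. The coordinates in the usage lines are invented toy values, not taken from any protein structure.

```python
import numpy as np


def contact_matrix(ca_coords, cutoff=7.0):
    """Adjacency matrix of the contact graph.

    Residues i and j are linked when their C-alpha atoms are at most
    `cutoff` angstroms apart (inclusive cutoff is an assumption).
    """
    coords = np.asarray(ca_coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-contacts
    return adj


def modularity_matrix(adj):
    """Newman's modularity matrix: B_ij = A_ij - k_i * k_j / (2m)."""
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # sum of degrees equals 2m
    return adj - np.outer(degrees, degrees) / two_m


def leading_eigenvector_split(adj):
    """Bisect the graph by the signs of the leading eigenvector of B.

    Returns (labels, q): labels are +1/-1 per node, q is the modularity
    of that two-way division, q = s^T B s / (4m).
    """
    b = modularity_matrix(adj)
    eigvals, eigvecs = np.linalg.eigh(b)  # B is symmetric; eigenvalues ascend
    leading = eigvecs[:, -1]
    s = np.where(leading >= 0, 1.0, -1.0)
    two_m = adj.sum()
    q = float(s @ b @ s) / (2.0 * two_m)
    return s.astype(int), q


# Toy usage: two compact groups of four points each (made-up coordinates).
toy_coords = [
    (0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (3.0, 3.0, 0.0),
    (9.5, 0.0, 0.0), (12.5, 0.0, 0.0), (9.5, 3.0, 0.0), (12.5, 3.0, 0.0),
]
toy_adj = contact_matrix(toy_coords)
toy_labels, toy_q = leading_eigenvector_split(toy_adj)
print(toy_labels, round(toy_q, 3))
```

To recover more than two domains, Newman's method applies the bisection repeatedly to each part and stops when a further split no longer increases modularity; that step is omitted here for brevity.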


Domain prediction; Protein contact networks; Graph spectral analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Hari Krishna Yalamanchili
    • 1
  • Nita Parekh
    • 1
  1. Center for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Gachibowli, Hyderabad, India
