Graph Spectral Approach for Identifying Protein Domains
Here we present a simple method based on graph spectral properties to automatically partition multi-domain proteins into individual domains. The identification of structural domains in proteins rests on the assumption that interactions between amino acids are more numerous within a domain than across domains. These interactions and the topological details of protein structures can be effectively captured by the protein contact graph, constructed by treating each amino acid as a node and drawing an edge between two nodes if the Cα atoms of the corresponding amino acids are within 7 Å of each other. Here we show that Newman’s community detection approach for social networks can be used to identify domains in protein structures. We have implemented this approach on protein contact networks and analyzed the eigenvector corresponding to the largest eigenvalue of the modularity matrix, a modified form of the adjacency matrix, using a quality function called “modularity” to identify optimal divisions of the network into domains. The proposed approach works even when the domains are formed from amino acids that are not sequential along the polypeptide chain, and no a priori information regarding the number of nodes is required.
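The procedure described above can be illustrated with a minimal sketch. It assumes the Cα coordinates are already available as an array of 3D points, and it shows only a single bisection using the leading eigenvector of the modularity matrix B = A - k kᵀ / (2m), where A is the adjacency matrix, k the degree vector and m the number of edges. The function names (`contact_adjacency`, `modularity_matrix`, `leading_eigenvector_split`) and the use of NumPy are assumptions made for illustration, not the authors' implementation. Repeated bisection and the stopping rule are omitted.

```python
import numpy as np


def contact_adjacency(ca_coords, cutoff=7.0):
    """Build the contact graph: 1 where two distinct C-alpha atoms lie within `cutoff` angstroms."""
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise Euclidean distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = (dist <= cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-contacts
    return adj


def modularity_matrix(adj):
    """Modularity matrix B_ij = A_ij - k_i * k_j / (2m)."""
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # sum of degrees equals twice the edge count
    return adj - np.outer(degrees, degrees) / two_m


def leading_eigenvector_split(adj):
    """Bisect the graph by the signs of the leading eigenvector of B.

    Returns the +1/-1 membership vector s and the modularity
    Q = s^T B s / (4m) of that two-way division.
    """
    B = modularity_matrix(adj)
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(B)
    leading = eigvecs[:, -1]
    s = np.where(leading >= 0, 1.0, -1.0)
    two_m = adj.sum()
    q = float(s @ B @ s) / (2.0 * two_m)
    return s, q
```

For a toy input of two compact clusters of points joined by a single contact, `leading_eigenvector_split(contact_adjacency(coords))` assigns each cluster its own sign. A positive modularity suggests the division is worth keeping, while a value at or below zero suggests the structure should be left as a single domain.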
Keywords: Domain prediction, Protein contact networks, Graph spectral analysis
- 4. Holm, L., Sander, C.: The FSSP database of structurally aligned protein fold families. Nucl. Acids Res. 22, 3600–3609 (1994)
- 11. Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006)
- 13. Clauset, A., Newman, M.E.J., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004)
- 15. Ford, L.R., Fulkerson, D.R.: Flows in Networks. Princeton University Press, Princeton (1962)
- 16. Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal Complex Systems 1695 (2006)