Abstract
This paper addresses the problem of assessing similarity between node-labeled, edge-weighted graphs. Graph comparison is usually based on the maximum common subgraph (mcs) measure, which is overly stringent and sensitive to small deviations and errors. To overcome these issues, we propose a relaxation of the mcs measure based on so-called communities. A community serves as an “almost common” subgraph with a high concentration of edges. Our approach increases tolerance to noise and structural variation, especially in biological data. The proposed measure is validated in an experimental study that analyzes similarities among protein families based on the properties of their active sites.
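To make the contrast between a strict common subgraph and an “almost common” one concrete, the following minimal sketch measures edge concentration as plain edge density. It is an illustration under stated assumptions, not the measure proposed in the paper: the functions `edge_density` and `is_dense_group`, the 0.8 threshold, and the toy graph are all hypothetical, and the abstract does not state which community detection criterion is actually used.

```python
from itertools import combinations


def edge_density(nodes, edges):
    """Fraction of all possible node pairs in `nodes` that are joined by an edge."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = {frozenset(e) for e in edges}
    linked = sum(
        1 for u, v in combinations(sorted(nodes), 2) if frozenset((u, v)) in present
    )
    return linked / (n * (n - 1) / 2)


def is_dense_group(nodes, edges, threshold=1.0):
    """Accept a node group whose edge density reaches `threshold`.

    threshold=1.0 demands a clique (the strict case); a lower value
    (hypothetical, chosen only for illustration) tolerates missing edges.
    """
    return edge_density(nodes, edges) >= threshold


# Toy data (assumed): a 5-node clique, and the same graph with one edge lost to noise.
clique = list(combinations("ABCDE", 2))
noisy = [e for e in clique if e != ("A", "B")]

print(edge_density("ABCDE", clique))                 # all 10 pairs linked
print(edge_density("ABCDE", noisy))                  # 9 of 10 pairs linked
print(is_dense_group("ABCDE", noisy, threshold=1.0)) # strict criterion rejects
print(is_dense_group("ABCDE", noisy, threshold=0.8)) # relaxed criterion accepts
```

In this toy case a single missing edge is enough for the strict criterion to reject the group, while the relaxed criterion still accepts it, which is the kind of noise tolerance the abstract describes.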
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mallek, S., Boukhris, I., Elouedi, Z. (2015). Predicting Proteins Functional Family: A Graph-Based Similarity Derived from Community Detection. In: Filev, D., et al. Intelligent Systems'2014. Advances in Intelligent Systems and Computing, vol 323. Springer, Cham. https://doi.org/10.1007/978-3-319-11310-4_54
DOI: https://doi.org/10.1007/978-3-319-11310-4_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11309-8
Online ISBN: 978-3-319-11310-4
eBook Packages: Engineering, Engineering (R0)