Predicting Proteins Functional Family: A Graph-Based Similarity Derived from Community Detection

Mallek, Sabrine; Boukhris, Imen; Elouedi, Zied

doi:10.1007/978-3-319-11310-4_54


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 323)


Abstract

This paper addresses the problem of assessing similarities between node-labeled, edge-weighted graphs. Graph comparison is usually based on the maximum common subgraph (mcs) measure, which is overly stringent and sensitive to small deviations and errors. To overcome these issues, we propose a relaxation of the mcs measure based on so-called communities. A community serves as an “almost common” subgraph with a high concentration of edges. This approach increases tolerance to noise and structural variation, especially in biological data. The proposed measure is validated in an experimental study that analyzes the similarities among protein families based on the properties of their active sites.
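The abstract describes a community as a subgraph with a high concentration of edges. The paper's own detection procedure and similarity formula are not visible in this preview, so the following is only a minimal sketch of one standard way to score that concentration: Newman-Girvan modularity of a given partition. The function name `modularity`, the `(u, v, weight)` edge format, and the toy graph are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Newman-Girvan modularity of a partition of an undirected weighted graph.

    edges: list of (u, v, weight) tuples (hypothetical input format).
    communities: list of disjoint node sets covering every node.
    """
    total = sum(w for _, _, w in edges)  # total edge weight W
    member = {n: i for i, c in enumerate(communities) for n in c}
    inside = defaultdict(float)    # weight of edges with both ends in community i
    strength = defaultdict(float)  # summed weighted degree of community i
    for u, v, w in edges:
        strength[member[u]] += w
        strength[member[v]] += w
        if member[u] == member[v]:
            inside[member[u]] += w
    # Q = sum over communities of (internal share - expected share at random)
    return sum(
        inside[i] / total - (strength[i] / (2 * total)) ** 2
        for i in range(len(communities))
    )


# Toy graph (illustrative only): two triangles joined by a single bridge edge.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
split = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
merged = modularity(edges, [{0, 1, 2, 3, 4, 5}])
print(split > merged)
```

On this toy graph the two-triangle split scores higher than treating all six nodes as one group, which is the sense in which a community concentrates edges. How such communities are turned into a graph similarity is specific to the paper and is not reproduced here.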

Author information

Authors and Affiliations

LARODEC, Institut Supérieur de Gestion de Tunis, Université de Tunis, Tunis, Tunisia
Sabrine Mallek, Imen Boukhris & Zied Elouedi

Authors

Sabrine Mallek
Imen Boukhris
Zied Elouedi

Corresponding author

Correspondence to Sabrine Mallek.

Editor information

Editors and Affiliations

Ford Motor Company, Research & Advanced Engineering, Dearborn, Michigan, USA
D. Filev
Industrial Research Institute for Automation and Measurements (PIAP), Warsaw, Poland
J. Jabłkowski
Polish Academy of Sciences, Systems Research Institute, Warsaw, Poland
J. Kacprzyk
Polish Academy of Sciences and WIT - Warsaw School of Information Technology, Systems Research Institute, Warsaw, Poland
M. Krawczak
Bulgarian Academy of Sciences, Institute of Information and Communication Technologies, Sofia, Bulgaria
I. Popchev
Department of Computer Engineering, Częstochowa University of Technology, Częstochowa, Poland
L. Rutkowski
Bulgarian Academy of Sciences, Institute of Information and Communication Technologies, Sofia, Bulgaria
V. Sgurev
Department of Computer and Information Technologies, “Prof. Assen Zlatarov” University, Faculty of Technical Sciences, Bourgas, Bulgaria
E. Sotirova
Industrial Research Institute for Automation and Measurements (PIAP), Warsaw, Poland
P. Szynkarczyk
Polish Academy of Sciences, Systems Research Institute, Warsaw, Poland
S. Zadrozny

About this paper

Cite this paper

Mallek, S., Boukhris, I., Elouedi, Z. (2015). Predicting Proteins Functional Family: A Graph-Based Similarity Derived from Community Detection. In: Filev, D., et al. Intelligent Systems'2014. Advances in Intelligent Systems and Computing, vol 323. Springer, Cham. https://doi.org/10.1007/978-3-319-11310-4_54

DOI: https://doi.org/10.1007/978-3-319-11310-4_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11309-8
Online ISBN: 978-3-319-11310-4
eBook Packages: Engineering, Engineering (R0)
