A Fast Algorithm for Large Common Connected Induced Subgraphs

Conte, Alessio; Grossi, Roberto; Marino, Andrea; Tattini, Lorenzo; Versari, Luca

doi:10.1007/978-3-319-58163-7_4

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10252)

Included in the following conference series:

International Conference on Algorithms for Computational Biology

Abstract

We present a fast algorithm for finding large common subgraphs, which can be exploited for detecting structural and functional relationships between biological macromolecules. Many fast algorithms exist for finding a single maximum common subgraph. We show with an example that this gives limited information, motivating the less studied problem of finding many large common subgraphs covering different areas. As the latter is also hard, we give heuristics that improve performance by several orders of magnitude. As a case study, we validate our findings experimentally on protein graphs with thousands of atoms.

Acknowledgments

Work partially supported by projects MIUR PRIN 2012C4E3KT (all authors except LT, LV) and UNIPI PRA_2015_0058 (authors RG, LT).

Author information

Authors and Affiliations

Inria, Università di Pisa and Erable, Pisa, Italy
Alessio Conte, Roberto Grossi & Andrea Marino
IRCAN, CNRS UMR 7284, Nice, France
Lorenzo Tattini
Scuola Normale Superiore, Pisa, Italy
Luca Versari

Corresponding author

Correspondence to Andrea Marino.

Editor information

Editors and Affiliations

University of Aveiro, Aveiro, Portugal
Daniel Figueiredo
Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
University of Aveiro, Aveiro, Portugal
Diogo Pratas
University of Extremadura, Caceres, Spain
Miguel A. Vega-Rodríguez

About this paper

Cite this paper

Conte, A., Grossi, R., Marino, A., Tattini, L., Versari, L. (2017). A Fast Algorithm for Large Common Connected Induced Subgraphs. In: Figueiredo, D., Martín-Vide, C., Pratas, D., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2017. Lecture Notes in Computer Science, vol 10252. Springer, Cham. https://doi.org/10.1007/978-3-319-58163-7_4

DOI: https://doi.org/10.1007/978-3-319-58163-7_4
Published: 25 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58162-0
Online ISBN: 978-3-319-58163-7
eBook Packages: Computer Science, Computer Science (R0)
