A Fast Algorithm for Large Common Connected Induced Subgraphs

  • Conference paper
Algorithms for Computational Biology (AlCoB 2017)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10252)

Abstract

We present a fast algorithm for finding large common subgraphs, which can be exploited for detecting structural and functional relationships between biological macromolecules. Many fast algorithms exist for finding a single maximum common subgraph. We show with an example that this gives limited information, motivating the less studied problem of finding many large common subgraphs covering different areas. As the latter is also hard, we give heuristics that improve performance by several orders of magnitude. As a case study, we validate our findings experimentally on protein graphs with thousands of atoms.

Acknowledgments

Work partially supported by projects MIUR PRIN 2012C4E3KT (all authors except LT, LV) and UNIPI PRA_2015_0058 (authors RG, LT).

Author information

Corresponding author

Correspondence to Andrea Marino.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Conte, A., Grossi, R., Marino, A., Tattini, L., Versari, L. (2017). A Fast Algorithm for Large Common Connected Induced Subgraphs. In: Figueiredo, D., Martín-Vide, C., Pratas, D., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2017. Lecture Notes in Computer Science, vol 10252. Springer, Cham. https://doi.org/10.1007/978-3-319-58163-7_4

  • DOI: https://doi.org/10.1007/978-3-319-58163-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58162-0

  • Online ISBN: 978-3-319-58163-7

  • eBook Packages: Computer Science, Computer Science (R0)
