Automatic Inference of Graph Transformation Rules Using the Cyclic Nature of Chemical Reactions

  • Christoph Flamm
  • Daniel Merkle
  • Peter F. Stadler
  • Uffe Thorsen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9761)


Graph transformation systems have the potential to be realistic models of chemistry, provided a comprehensive collection of reaction rules can be extracted from the body of chemical knowledge. A first key step for rule learning is the computation of atom-atom mappings, i.e., the atom-wise correspondence between products and educts of all published chemical reactions. This can be phrased as a maximum common edge subgraph problem with the constraint that transition states must have cyclic structure. We describe a search tree method well suited for small edit distances and an integer linear program best suited for general instances. Using a manually curated database of biochemical reactions as an example, we demonstrate that it is feasible to compute atom-atom maps at large scales. In this context we address the network completion problem.
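To make the problem statement concrete, the following is a minimal brute-force sketch of atom-atom mapping as a maximum common edge subgraph search. It is not the search tree method or the integer linear program of the paper: it enumerates every element-preserving bijection between educt and product atoms and keeps the one that preserves the most bonds. Bond orders and the cyclic transition-state constraint are ignored, and the function name `best_atom_map` and the input encoding (a list of element symbols plus a list of index pairs) are assumptions made for illustration only.

```python
from itertools import permutations


def best_atom_map(educt_atoms, educt_bonds, product_atoms, product_bonds):
    """Return (mapping, kept) for the label-preserving bijection keeping most bonds.

    mapping[i] is the product atom index assigned to educt atom i.
    Hypothetical helper; exponential in the number of atoms, toy inputs only.
    """
    if sorted(educt_atoms) != sorted(product_atoms):
        raise ValueError("educts and products must contain the same atoms")

    n = len(educt_atoms)
    # Undirected bonds are stored as frozensets so (u, v) equals (v, u).
    product_edges = {frozenset(bond) for bond in product_bonds}

    best_map, best_kept = None, -1
    for perm in permutations(range(n)):
        # Skip bijections that would change an atom's element.
        if any(educt_atoms[i] != product_atoms[perm[i]] for i in range(n)):
            continue
        # Count educt bonds whose image is also a bond in the product.
        kept = sum(
            frozenset((perm[u], perm[v])) in product_edges
            for u, v in educt_bonds
        )
        if kept > best_kept:
            best_map, best_kept = perm, kept
    return best_map, best_kept


# Toy isomerization H-C-N -> H-N-C (bond orders ignored).
mapping, kept = best_atom_map(
    ["H", "C", "N"], [(0, 1), (1, 2)],
    ["H", "N", "C"], [(0, 1), (1, 2)],
)
# Bonds broken plus bonds formed under this mapping.
changed = 2 + 2 - 2 * kept
```

In this toy case only the C-N bond is preserved, so one bond is broken (H-C) and one is formed (H-N). Maximizing the number of kept bonds is equivalent to minimizing the number of changed bonds, which is why the mapping problem can be read as a small-edit-distance search.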


Keywords: Chemistry, Atom-atom mapping, Maximum common edge subgraph, Integer linear programming, Network completion



This work was supported in part by the Volkswagen Stiftung (project no. I/82719), the COST Action CM1304 “Systems Chemistry”, and the Danish Council for Independent Research, Natural Sciences.




Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Christoph Flamm (2, 8)
  • Daniel Merkle (1), corresponding author
  • Peter F. Stadler (2, 3, 4, 5, 6, 7)
  • Uffe Thorsen (1)

  1. Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
  2. Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  3. Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  4. Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  5. Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, Germany
  6. Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  7. Santa Fe Institute, Santa Fe, NM, USA
  8. Research Network Chemistry Meets Microbiology, University of Vienna, Wien, Austria
