Natural Computing, Volume 12, Issue 4, pp 589–603

On computational complexity of graph inference from counting

  • Szilárd Zsolt Fazekas
  • Hiro Ito
  • Yasushi Okuno
  • Shinnosuke Seki
  • Kei Taneishi

Abstract

In de novo drug design, chemical compounds are quantized as real-valued vectors called chemical descriptors; an optimization algorithm runs on the known drug-like compounds in a database and outputs an optimal chemical descriptor. Since chemical synthesis requires structural information, chemical graphs must then be inferred from the obtained descriptor. This is formalized as the problem of inferring a graph from a real-valued vector. By generalizing subword history, which was originally introduced in formal language theory to extract numerical information about words and languages by counting, we propose a comprehensive framework for investigating the computational complexity of chemical graph inference. We also propose a (pseudo-)polynomial-time algorithm for inferring graphs in a class of practical importance from spectra.
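
As an illustration of the kind of counting that subword histories are built on, the following minimal sketch counts the occurrences of a word u as a scattered (not necessarily contiguous) subword of a word w. It is not taken from the paper: the function name `count_scattered` and the example words are hypothetical, and the graph-based generalization studied in the article is not reproduced here.

```python
def count_scattered(w: str, u: str) -> int:
    """Count occurrences of u as a scattered subword of w.

    Hypothetical helper for illustration only. dp[j] holds the number of
    ways to match the first j letters of u inside the prefix of w read so far.
    """
    dp = [0] * (len(u) + 1)
    dp[0] = 1  # the empty word occurs exactly once in any word
    for c in w:
        # Go right to left so each letter of w extends a match at most once.
        for j in range(len(u), 0, -1):
            if u[j - 1] == c:
                dp[j] += dp[j - 1]
    return dp[len(u)]


if __name__ == "__main__":
    # "ab" occurs in "abab" at positions (1,2), (1,4) and (3,4).
    print(count_scattered("abab", "ab"))
```

Graph inference runs in the opposite direction: the counts are given, and the task is to decide whether some object realizing them exists, which is where the complexity questions addressed in the article arise.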

Keywords

Computational complexity; Counting; De novo drug design; Graph inference; Tree-decomposition; Spectrum; Walk history

List of symbols

\(\Upsigma\)

Alphabet

\({\overrightarrow{{\mathcal{G}}}}\)

The class of (\(\Upsigma\)-labeled, loopless, (weakly-)connected) directed multigraphs

\({{\mathcal{G}}}\)

The class of (\(\Upsigma\)-labeled, loopless, connected) undirected multigraphs

d(v)

The degree of a vertex v

T

A tree

h(T)

The height of tree T

\(T_K\)

The frontier vector of T at level K

tw(G)

The tree-width of G

\(\Upupsilon\)

The class of trees

\(\Upupsilon_h\)

The class of trees of height at most h

\({{\mathcal{SPG}}}\)

The class of series-parallel graphs

\({{\mathcal{PLG}}}\)

The class of planar graphs

\(\mathcal{TW}(w)\)

The class of graphs of tree-width at most w

\({{\overrightarrow{\mathcal{SSG}}}}\)

The class of scattered subword graphs

\({{\overrightarrow{\mathcal{CSG}}}}\)

The class of continuous subword graphs

WH

A walk history

\({{\mathcal{SWH}}}\)

The class of systems of walk histories

\({{\mathcal{SLWH}}}\)

The class of systems of linear walk histories

\({{\mathcal{COUNT}}}\)

The class of counting systems

\(\mathcal{WH}\)

The class of systems of single walk history

\(\mathcal{LWH}\)

The class of systems of single linear walk history

A–F algorithm

Akutsu–Fukagawa algorithm

Notes

Acknowledgements

We wish to express our gratitude to the anonymous referees for carefully and thoroughly reviewing an earlier version of this manuscript and for their valuable comments and suggestions. Shinnosuke Seki expresses his sincere gratitude to Professor Mark Daley, Professor Oscar H. Ibarra, Professor Helmut Jürgensen, Professor Lila Kari, and Professor Arto Salomaa for creative discussions on the research topic of this paper. This research was carried out with the financial support of the JSPS Postdoctoral Fellowship P10827 to Szilárd Zsolt Fazekas, of the Funding Program for Next Generation World-Leading Researchers (NEXT program) to Yasushi Okuno, and of the Kyoto University Start-up Grant-in-Aid for Young Scientists, No. 021530, to Shinnosuke Seki. The work of Shinnosuke Seki was also financially supported by the Department of Information and Computer Science, Aalto University.

References

  1. Akutsu T, Fukagawa D (2005) Inferring a graph from path frequency. In: Apostolico A, Crochemore M, Park K (eds) CPM 2005. Lecture notes in computer science, vol 3537. Springer, New York, pp 371–382
  2. Bakir GH, Weston J, Schölkopf B (2004a) Learning to find pre-images. In: Advances in neural information processing systems, pp 449–456
  3. Bakir GH, Zien A, Tsuda K (2004b) Learning to find graph pre-images. In: Proceedings of the 26th DAGM symposium. Lecture notes in computer science, vol 3175. Springer, New York, pp 253–261
  4. Bodlaender H (1998) A partial k-arboretum of graphs with bounded treewidth. Theor Comput Sci 209(1–2):1–45
  5. Diestel R (2010) Graph theory, 4th edn. Springer, New York
  6. Fraigniaud P, Nisse N (2006) Connected treewidth and connected graph searching. In: LATIN 2006. Lecture notes in computer science, vol 3887. Springer, New York, pp 479–490
  7. Fujiwara H et al (2008) Enumerating treelike chemical graphs with given path frequency. J Chem Inf Model 48:1345–1357
  8. Garey MR, Johnson DS (1979) Computers and intractability. A guide to the theory of NP-completeness. W. H. Freeman and Co, New York
  9. Goto S et al (2002) LIGAND: Database of chemical compounds and reactions in biological pathways. Nucleic Acids Res 30:402–404
  10. Ibarra OH (1978) Reversal-bounded multicounter machines and their decision problems. J ACM 25:116–133
  11. Leslie C, Eskin E, Noble WS (2002) The spectrum kernel: a string kernel for SVM protein classification. In: Proceedings of the 7th Pacific symposium on biocomputing, pp 564–575
  12. Mateescu A, Salomaa A, Yu S (2004) Subword histories and Parikh matrices. J Comput Syst Sci 68:1–21
  13. Matiyasevich Y (1970) Solution of the tenth problem of Hilbert. Matematikai Lapok 21:83–87
  14. Matiyasevich Y (1993) Hilbert’s tenth problem. MIT Press, Cambridge
  15. Nagamochi H (2009) A detachment algorithm for inferring a graph from path frequency. Algorithmica 53:207–224
  16. Parikh RJ (1966) On context-free languages. J Assoc Comput Mach 13:570–581
  17. Robertson N, Seymour PD (1986) Graph minors. II. Algorithmic aspects of tree-width. J Algor 7:309–322
  18. Rozenberg G, Salomaa A (eds) (1997) Handbook of formal languages, vol 1. Springer, New York
  19. Seki S (2011) Absoluteness of subword inequality is undecidable. Theor Comput Sci 418:116–120
  20. Shannon CE, Weaver W (1949) The mathematical theory of communication. The University of Illinois Press, Urbana
  21. Yamaguchi A, Aoki KF, Mamitsuka H (2003) Graph complexity of chemical compounds in biological pathways. Genome Inform 14:376–377

Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  • Szilárd Zsolt Fazekas (1)
  • Hiro Ito (2)
  • Yasushi Okuno (3)
  • Shinnosuke Seki (4), corresponding author
  • Kei Taneishi (3)

  1. Nyíregyháza, Mathematics and Informatics Institute, Nyíregyháza, Hungary
  2. School of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan
  3. Department of Systems Bioscience for Drug Discovery, Kyoto University, Kyoto, Japan
  4. Department of Information and Computer Science, Aalto University, Aalto, Finland
