Finding Maximum Colorful Subtrees in Practice

  • Imran Rauf
  • Florian Rasche
  • François Nicolas
  • Sebastian Böcker
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7262)

Abstract

In metabolomics and other fields dealing with small compounds, mass spectrometry is applied as a sensitive high-throughput technique. Recently, fragmentation trees have been proposed to automatically analyze the fragmentation mass spectra recorded by such instruments. Computationally, this leads to the problem of finding a maximum-weight subtree in an edge-weighted and vertex-colored graph, such that every color appears at most once in the solution.
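To make the problem statement concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the input is a small directed acyclic graph given as a dictionary of weighted edges, with a designated root; the function names `greedy_colorful_subtree` and `exact_colorful_subtree` and the toy instance are hypothetical. The greedy routine is a simple illustrative heuristic (it is not the tree completion heuristic of the paper), and the exact routine is plain exhaustive search, usable only on tiny instances.

```python
from itertools import product


def greedy_colorful_subtree(edges, color, root):
    """Illustrative greedy heuristic (an assumption, not the paper's method).

    edges: dict {(u, v): weight}; color: dict {vertex: color}.
    Repeatedly attaches the heaviest positive edge that leaves the
    current tree and enters a vertex whose color is still unused.
    """
    tree_vertices = {root}
    used_colors = {color[root]}
    chosen = []
    while True:
        candidates = [
            (w, u, v)
            for (u, v), w in edges.items()
            if u in tree_vertices
            and v not in tree_vertices
            and color[v] not in used_colors
            and w > 0
        ]
        if not candidates:
            break
        _, u, v = max(candidates)
        chosen.append((u, v))
        tree_vertices.add(v)
        used_colors.add(color[v])
    return sum(edges[e] for e in chosen), chosen


def exact_colorful_subtree(edges, color, root):
    """Exhaustive search over parent choices; assumes the graph is a DAG."""
    vertices = sorted({v for e in edges for v in e} - {root})
    # Each non-root vertex is either left out (None) or gets one parent.
    options = {v: [None] + [u for (u, x) in edges if x == v] for v in vertices}
    best = (0, [])  # the tree consisting of the root alone
    for choice in product(*(options[v] for v in vertices)):
        parent = {v: p for v, p in zip(vertices, choice) if p is not None}
        in_tree = set(parent) | {root}
        # Every chosen vertex must hang off the root or another chosen vertex.
        if any(p not in in_tree for p in parent.values()):
            continue
        # Colorful: no color may appear twice in the tree.
        colors = [color[v] for v in in_tree]
        if len(colors) != len(set(colors)):
            continue
        score = sum(edges[(p, v)] for v, p in parent.items())
        if score > best[0]:
            best = (score, sorted((p, v) for v, p in parent.items()))
    return best


# Toy instance: vertices "a" and "b" share a color, so only one may be used.
edges = {("r", "a"): 5, ("r", "b"): 4, ("b", "c"): 6, ("a", "c"): 1}
color = {"r": 0, "a": 1, "b": 1, "c": 2}
print(greedy_colorful_subtree(edges, color, "r"))
print(exact_colorful_subtree(edges, color, "r"))
```

On this toy instance the greedy choice of the heavier first edge blocks the better branch, which illustrates why a heuristic and an exact method can disagree.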

We introduce new heuristics and an exact algorithm for this Maximum Colorful Subtree problem, and evaluate them against existing algorithms on real-world datasets. Our tree completion heuristic consistently scores better than other heuristics, while the integer programming-based algorithm produces optimal trees with modest running times. Our fast and accurate heuristic can help to determine molecular formulas based on fragmentation trees. On the other hand, optimal trees from the integer linear program are useful if structure is relevant, e.g., for tree alignments.
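As a rough illustration of how the problem can be cast as an integer linear program, one natural formulation is sketched below. It is an assumption for illustration and not necessarily the formulation used by the authors. It assumes a directed acyclic graph $G = (V, E)$ with edge weights $w$, vertex colors $c$, a root $r$ with no incoming edges, and that vertices sharing the root's color have been removed; the binary variable $x_{uv}$ indicates whether edge $uv$ is part of the tree.

```latex
\begin{align*}
\max \quad & \sum_{uv \in E} w(uv)\, x_{uv} \\
\text{s.t.} \quad
  & \sum_{u :\, uv \in E} x_{uv} \le 1
    && \text{for all } v \in V \setminus \{r\}, \\
  & x_{vw} \le \sum_{u :\, uv \in E} x_{uv}
    && \text{for all } vw \in E \text{ with } v \ne r, \\
  & \sum_{uv \in E :\, c(v) = \gamma} x_{uv} \le 1
    && \text{for all colors } \gamma, \\
  & x_{uv} \in \{0, 1\}
    && \text{for all } uv \in E.
\end{align*}
```

The first constraint gives every vertex at most one parent, the second allows an edge to leave a vertex only if some edge enters it, and the third enforces that each color is used at most once. Because the graph is assumed to be acyclic, these constraints together force the selected edges to form a tree rooted at $r$.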

Keywords

Directed Acyclic Graph, Integer Linear Program, Tandem Mass Spectrum, Input Graph, Greedy Heuristic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Imran Rauf (1)
  • Florian Rasche (2)
  • François Nicolas (2)
  • Sebastian Böcker (2)

  1. Department of Computer Science, University of Karachi, Karachi, Pakistan
  2. Lehrstuhl für Bioinformatik, Friedrich-Schiller-Universität Jena, Jena, Germany
