Algorithmica

pp 1–17

On Bubble Generators in Directed Graphs

  • V. Acuña
  • R. Grossi
  • G. F. Italiano
  • L. Lima
  • R. Rizzi
  • G. Sacomoto
  • M.-F. Sagot
  • B. Sinaimeri
Article

Abstract

Bubbles are pairs of internally vertex-disjoint (s, t)-paths in a directed graph, which have many applications in the processing of DNA and RNA data. Listing and analysing all bubbles in a given graph is usually infeasible in practice, due to the exponential number of bubbles present in real data graphs. In this paper, we propose a notion of bubble generator set, i.e., a polynomial-sized subset of bubbles from which all the other bubbles can be obtained through a suitable application of a specific symmetric difference operator. This set provides a compact representation of the bubble space of a graph. A bubble generator can be useful in practice, since some pertinent information about all the bubbles can be more conveniently extracted from this compact set. We provide a polynomial-time algorithm to decompose any bubble of a graph into the bubbles of such a generator in a tree-like fashion. Finally, we present two applications of the bubble generator on a real RNA-seq dataset.
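
To make the definition concrete, the following minimal Python sketch checks whether two paths form a bubble and combines two bubbles through the symmetric difference of their edge sets. It is an illustration only: the abstract does not spell out the paper's specific operator, so the plain set symmetric difference used here is an assumed stand-in, and the function names (`is_bubble`, `bubble_edges`) and the toy vertices are hypothetical. In general, the symmetric difference of two arbitrary bubbles need not be a bubble; the example is chosen so that it is.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def path_edges(path: List[str]) -> Set[Edge]:
    """Return the set of directed edges traversed by a path given as a vertex list."""
    return set(zip(path, path[1:]))


def is_bubble(p: List[str], q: List[str]) -> bool:
    """Check that p and q are two distinct simple (s, t)-paths with s != t
    that share no internal vertex (internally vertex-disjoint)."""
    if len(p) < 2 or len(q) < 2:
        return False
    if p[0] != q[0] or p[-1] != q[-1] or p[0] == p[-1]:
        return False
    if p == q:
        return False
    # Each path must be simple (no repeated vertex).
    if len(set(p)) != len(p) or len(set(q)) != len(q):
        return False
    # Internal vertices are everything except the shared source and target.
    return set(p[1:-1]).isdisjoint(q[1:-1])


def bubble_edges(p: List[str], q: List[str]) -> Set[Edge]:
    """Edge set of the bubble formed by the two paths."""
    return path_edges(p) | path_edges(q)


# Toy graph (assumed for illustration): three parallel paths from s to t.
p1 = ["s", "a", "t"]
p2 = ["s", "b", "t"]
p3 = ["s", "c", "t"]

b12 = bubble_edges(p1, p2)
b23 = bubble_edges(p2, p3)

# The shared path s -> b -> t cancels out, leaving the bubble (p1, p3).
combined = b12 ^ b23
```

Here the two bubbles (p1, p2) and (p2, p3) act as a small generator: the third bubble (p1, p3) is recovered from them without being stored explicitly, which is the kind of compact representation the abstract describes.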

Keywords

  • Bubbles
  • Bubble generator set
  • Decomposition algorithm

Notes

Acknowledgements

V. Acuña is supported by Fondecyt 1140631, PIA Fellowship AFB170001 and Center for Genome Regulation FONDAP 15090007. R. Grossi and G. F. Italiano are partially supported by MIUR, the Italian Ministry for Education, University and Research, under PRIN Project AHeAD (Efficient Algorithms for HArnessing Networked Data). Part of this work was done while G. F. Italiano was visiting Université de Lyon. L. Lima is supported by the Brazilian Ministry of Science, Technology and Innovation (in Portuguese, Ministério da Ciência, Tecnologia e Inovação - MCTI) through the National Council of Technological and Scientific Development (in Portuguese, Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq), under the Science Without Borders (in Portuguese, Ciências Sem Fronteiras) scholarship grant Process No. 203362/2014-4. B. Sinaimeri, L. Lima and M.-F. Sagot are partially funded by the French ANR project Aster (2016–2020), and together with V. Acuña, also by the Stic AmSud project MAIA (2016–2017). This work was performed using the computing facilities of the CC LBBE/PRABI.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Center for Mathematical Modeling, Universidad de Chile and UMI CNRS 2807, Santiago, Chile
  2. Università di Pisa, Pisa, Italy
  3. Erable, INRIA, Villeurbanne, France
  4. LUISS University, Rome, Italy
  5. Università di Verona, Verona, Italy
  6. Erable INRIA Rhône-Alpes; Université Lyon 1, CNRS, LBBE, UMR 5558, Villeurbanne, France
  7. Università di Roma “Tor Vergata”, Rome, Italy