On Bubble Generators in Directed Graphs
Bubbles are pairs of internally vertex-disjoint (s, t)-paths with applications in the processing of DNA and RNA data. For example, enumerating alternative splicing events in a reference-free context can be done by enumerating all bubbles in a de Bruijn graph built from RNA-seq reads. However, listing and analysing all bubbles in a given graph is usually infeasible in practice, because graphs built from real data contain exponentially many bubbles. In this paper, we propose the notion of a bubble generator set, i.e. a polynomial-sized subset of bubbles from which all the others can be obtained through the application of a specific symmetric difference operator. This set provides a compact representation of the bubble space of a graph, which is useful in practice because pertinent information about all the bubbles can be extracted more conveniently from this compact set. Furthermore, we provide a polynomial-time algorithm to decompose any bubble of a graph into the bubbles of such a generator in a tree-like fashion.
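To make the definition concrete, the following is a minimal brute-force sketch in Python that lists the bubbles of a small directed graph by pairing up internally vertex-disjoint (s, t)-paths. It is an illustration only, not the generator or decomposition algorithm of the paper: the function names `simple_paths` and `bubbles`, the adjacency-dictionary representation, and the assumption that the graph has no parallel arcs are all choices made for this example. Its running time is exponential in general, which is the difficulty the generator set is meant to avoid.

```python
from itertools import combinations


def simple_paths(graph, s, t, path=None):
    """Yield every simple directed path from s to t as a tuple of vertices."""
    path = (s,) if path is None else path
    if s == t:
        yield path
        return
    for nxt in graph.get(s, ()):
        if nxt not in path:  # keep the path simple
            yield from simple_paths(graph, nxt, t, path + (nxt,))


def bubbles(graph):
    """List all unordered pairs of internally vertex-disjoint (s, t)-paths, s != t.

    Assumes `graph` maps each vertex to a list of out-neighbours with no
    parallel arcs (an assumption of this sketch).
    """
    vertices = set(graph) | {v for outs in graph.values() for v in outs}
    found = []
    for s in sorted(vertices):
        for t in sorted(vertices):
            if s == t:
                continue
            paths = list(simple_paths(graph, s, t))
            for p, q in combinations(paths, 2):
                # Compare internal vertices only: the endpoints s and t are shared.
                if set(p[1:-1]).isdisjoint(q[1:-1]):
                    found.append((p, q))
    return found


# Three parallel routes from "s" to "t": via "a", via "b", and the direct arc.
three_routes = {"s": ["a", "b", "t"], "a": ["t"], "b": ["t"]}

# Two diamonds in a row, sharing the middle vertex "v1".
two_diamonds = {
    "v0": ["a1", "b1"], "a1": ["v1"], "b1": ["v1"],
    "v1": ["a2", "b2"], "a2": ["v2"], "b2": ["v2"],
}

for name, g in [("three_routes", three_routes), ("two_diamonds", two_diamonds)]:
    print(name, len(bubbles(g)))
```

In `three_routes`, every pair of the three (s, t)-paths forms a bubble. In `two_diamonds`, each diamond is a bubble, but no pair of (v0, v2)-paths is, because every such path passes through the internal vertex v1.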
Keywords: Bubbles, Bubble generator set, Bubble space, Decomposition algorithm
V. Acuña was supported by Fondecyt 1140631, CIRIC-INRIA Chile and Basal Project PBF 03. R. Grossi and G.F. Italiano were partially supported by MIUR, the Italian Ministry of Education, University and Research, under the Project AMANDA (Algorithmics for MAssive and Networked DAta). Part of this work was done while G.F. Italiano was visiting Université de Lyon. L. Lima is supported by the Brazilian Ministry of Science, Technology and Innovation (in Portuguese, Ministério da Ciência, Tecnologia e Inovação - MCTI) through the National Council of Technological and Scientific Development (in Portuguese, Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq), under the Science Without Borders (in Portuguese, Ciências Sem Fronteiras) scholarship grant process number 203362/2014-4. B. Sinaimeri, L. Lima and M.-F. Sagot are partially funded by the French ANR project Aster (2016–2020), and together with V. Acuña, also by the Stic AmSud project MAIA (2016–2017). This work was performed using the computing facilities of the CC LBBE/PRABI.
- 11. Lima, L., Sinaimeri, B., Sacomoto, G., Lopez-Maestre, H., Marchet, C., Miele, V., Sagot, M.F., Lacroix, V.: Playing hide and seek with repeats in local and global de novo transcriptome assembly of short RNA-seq reads. Algorithms Mol. Biol. 12(1), 2:1–2:19 (2017). doi: 10.1186/s13015-017-0091-2
- 15. Sacomoto, G., Lacroix, V., Sagot, M.-F.: A polynomial delay algorithm for the enumeration of bubbles with length constraints in directed graphs and its application to the detection of alternative splicing in RNA-seq data. In: Darling, A., Stoye, J. (eds.) WABI 2013. LNCS, vol. 8126, pp. 99–111. Springer, Heidelberg (2013). doi: 10.1007/978-3-642-40453-5_9
- 16. Sacomoto, G., Kielbassa, J., Chikhi, R., Uricaru, R., Antoniou, P., Sagot, M.F., Peterlongo, P., Lacroix, V.: KISSPLICE: de-novo calling alternative splicing events from RNA-seq data. BMC Bioinf. 13(S–6), S5 (2012)