Abstract
Binary jumbled pattern matching asks us to preprocess a binary string \(S\) so that we can answer queries \((i,j)\), each asking for a substring of \(S\) of length \(i\) with exactly \(j\) 1-bits. This problem naturally generalizes to vertex-labeled trees and graphs by replacing “substring” with “connected subgraph”. In this paper, we give an \(O(n^2 / \log ^2 n)\)-time solution for trees, matching the currently best bound for (the simpler problem of) strings. We also give an \({O}({g^{2 / 3} n^{4 / 3}/(\log n)^{4/3}})\)-time solution for strings that are compressed by a context-free grammar of size \(g\) in Chomsky normal form. This solution improves the known bounds when the string is compressible under many popular compression schemes. Finally, we prove that on graphs the problem is fixed-parameter tractable with respect to the treewidth \(w\) of the graph, even for a constant number of different vertex-labels, thus improving on the previous best \(n^{O(w)}\) algorithm.
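As an illustration of the string version of the problem, the following Python sketch builds a naive quadratic-time index. It relies on the well-known fact that, for a fixed length \(i\), the numbers of 1-bits over all length-\(i\) substrings form a contiguous interval (sliding the window by one position changes the count by at most one), so storing the minimum and maximum per length suffices to decide whether a matching substring exists. The names `build_index` and `query` are ours, chosen for illustration; this is not the \(O(n^2 / \log ^2 n)\) algorithm of the paper, and it reports only existence rather than a position.

```python
def build_index(s):
    """Naive O(n^2) index: for every length i, the min and max number of
    1-bits over all substrings of s with length i."""
    n = len(s)
    # prefix[k] = number of 1-bits in s[:k]
    prefix = [0] * (n + 1)
    for k, ch in enumerate(s):
        prefix[k + 1] = prefix[k] + (ch == "1")
    lo = [0] * (n + 1)
    hi = [0] * (n + 1)
    for i in range(1, n + 1):
        counts = [prefix[k + i] - prefix[k] for k in range(n - i + 1)]
        lo[i], hi[i] = min(counts), max(counts)
    return lo, hi


def query(index, i, j):
    """True iff some substring of length i has exactly j 1-bits."""
    lo, hi = index
    # Valid lengths are 1..n; within a length, achievable counts form an interval.
    return 1 <= i < len(lo) and lo[i] <= j <= hi[i]


index = build_index("0110100")
print(query(index, 4, 3))  # substring "1101" has length 4 and three 1-bits
print(query(index, 3, 0))  # no length-3 substring is all zeros
```

After the quadratic preprocessing, each query is answered with two comparisons.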
Notes
The root of the macro tree is an exception as it might have a top boundary node connected to two (rather than one) child micro trees. We focus on the other nodes. Handling the root is done in a very similar way.
A centroid decomposition can be found in linear time.
The meaning of the query here differs from that elsewhere in the paper for ease of presentation.
Here we slightly abuse our terminology and allow \(X_0\) to be the empty set.
There is a simple linear-time algorithm that, given a tree, finds the best way to share identical rooted subtrees [19].
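The subtree sharing mentioned in the last note can be sketched with hash-consing: process nodes bottom-up and give two nodes the same class identifier exactly when they carry the same label and their children have the same multiset of class identifiers. This is a minimal sketch under our own assumptions, not necessarily the algorithm of [19]: the function name `share_subtrees` and the dictionary-based tree representation are hypothetical, children are treated as unordered, and the sort and hashing make the running time expected \(O(n \log n)\) rather than worst-case linear.

```python
def share_subtrees(children, labels, root):
    """Assign each node a class id so that two nodes get the same id
    exactly when their rooted, labeled, unordered subtrees are identical.

    children: dict mapping a node to a list of its children (leaves may be absent)
    labels:   dict mapping a node to its label
    Returns (class_of, number_of_distinct_subtrees).
    """
    # Collect nodes so that every parent appears before its children.
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    ids = {}       # canonical key -> class id
    class_of = {}  # node -> class id
    # Reversed order visits children before their parent.
    for v in reversed(order):
        child_ids = sorted(class_of[c] for c in children.get(v, []))
        key = (labels[v], tuple(child_ids))
        class_of[v] = ids.setdefault(key, len(ids))
    return class_of, len(ids)


# A 7-node binary-labeled tree whose two depth-1 subtrees are identical.
children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
labels = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0, 5: 0, 6: 1}
class_of, distinct = share_subtrees(children, labels, 0)
print(class_of[1] == class_of[2], distinct)
```

Keeping one representative per class yields a DAG in which each distinct rooted subtree is stored once.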
References
Alstrup, S., Secher, J., Spork, M.: Optimal on-line decremental connectivity in trees. Inf. Process. Lett. 64(4), 161–164 (1997)
Ambalath, A.M., Balasundaram, R., Rao, C.H., Koppula, V., Misra, N., Philip, G., Ramanujan, M.S.: On the kernelization complexity of colorful motifs. In: Proceedings of the 5th International Symposium on Parameterized and Exact Computation, pp. 14–25 (2010)
Amir, A., Chan, T.M., Lewenstein, M., Lewenstein, N.: On hardness of jumbled indexing. In: Proceedings of the 41st International Colloquium on Automata, Languages and Programming (ICALP), pp. 114–125 (2014)
Badkobeh, G., Fici, G., Kroon, S., Lipták, Z.: Binary jumbled string matching for highly run-length compressible texts. Inf. Process. Lett. 113(17), 604–608 (2013)
Benson, G.: Composition alignment. In: Proceedings of the 3rd International Workshop on Algorithms in Bioinformatics (WABI), pp. 447–461 (2003)
Betzler, N., van Bevern, R., Fellows, M.R., Komusiewicz, C., Niedermeier, R.: Parameterized algorithmics for finding connected motifs in biological networks. IEEE/ACM Trans. Comput. Biol. Bioinform. 8(5), 1296–1308 (2011)
Böcker, S.: Simulating multiplexed SNP discovery rates using base-specific cleavage and mass spectrometry. Bioinformatics 23(2), 5–12 (2007)
Bodlaender, H.L.: A linear time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput. 25, 1305–1317 (1996)
Bodlaender, H.L.: Treewidth. Algorithmic techniques and results. In: Proceedings of the 22nd International Symposium on Mathematical Foundations of Computer Science (MFCS), pp. 19–36 (1997)
Burcsi, P., Cicalese, F., Fici, G., Lipták, Z.: On table arrangement, scrabble freaks, and jumbled pattern matching. In: Proceedings of the Symposium on Fun with Algorithms, pp. 89–101 (2010)
Burcsi, P., Cicalese, F., Fici, G., Lipták, Z.: Algorithms for jumbled pattern matching in strings. Int. J. Found. Comput. Sci. 23(2), 357–374 (2012)
Burcsi, P., Cicalese, F., Fici, G., Lipták, Z.: On approximate jumbled pattern matching in strings. Theory Comput. Syst. 50(1), 35–51 (2012)
Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theory 51(7), 2554–2576 (2005)
Cicalese, F., Fici, G., Lipták, Z.: Searching for jumbled patterns in strings. In: Proceedings of the Prague Stringology Conference, pp. 105–117 (2009)
Cicalese, F., Gagie, T., Giaquinta, E., Laber, E., Lipták, Z., Rizzi, R., Tomescu, A.I.: Indexes for jumbled pattern matching in strings, trees and graphs. In: Proceedings of the 20th International Symposium on String Processing and Information Retrieval (SPIRE), pp. 56–63 (2013)
Cicalese, F., Laber, E.S., Weimann, O., Yuster, R.: Near linear time construction of an approximate index for all maximum consecutive sub-sums of a sequence. In: Proceedings of the Symposium on Combinatorial Pattern Matching, pp. 149–158 (2012)
Dondi, R., Fertin, G., Vialette, S.: Complexity issues in vertex-colored graph pattern matching. J. Discrete Algorithms 9(1), 82–99 (2011)
Dondi, R., Fertin, G., Vialette, S.: Finding approximate and constrained motifs in graphs. In: Proceedings of the 22nd Annual Symposium on Combinatorial Pattern Matching, pp. 388–401 (2011)
Downey, P.J., Sethi, R., Tarjan, R.E.: Variations on the common subexpression problem. J. ACM 27, 758–771 (1980)
Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, Berlin (1999)
Fellows, M.R., Fertin, G., Hermelin, D., Vialette, S.: Upper and lower bounds for finding connected motifs in vertex-colored graphs. J. Comput. Syst. Sci. 77(4), 799–811 (2011)
Gawrychowski, P.: Faster algorithm for computing the edit distance between SLP-compressed strings. In: Proceedings of the Symposium on String Processing and Information Retrieval, pp. 229–236 (2012)
Giaquinta, E., Grabowski, S.: New algorithms for binary jumbled pattern matching. Inf. Process. Lett. 113(14–16), 538–542 (2013)
Jacobson, G.: Space-efficient static trees and graphs. In: Proceedings of the 30th Annual Symposium on Foundations of Computer Science (FOCS), pp. 549–554 (1989)
Kociumaka, T., Radoszewski, J., Rytter, W.: Efficient indexes for jumbled pattern matching with constant-sized alphabet. In: Proceedings of the 21st Annual European Symposium on Algorithms (ESA), pp. 625–636 (2013)
Lacroix, V., Fernandes, C.G., Sagot, M.-F.: Motif search in graphs: application to metabolic networks. IEEE/ACM Trans. Comput. Biol. Bioinform. 3(4), 360–368 (2006)
Moosa, T.M., Rahman, M.S.: Indexing permutations for binary strings. Inf. Process. Lett. 110(18–19), 795–798 (2010)
Moosa, T.M., Rahman, M.S.: Sub-quadratic time and linear space data structures for permutation matching in binary strings. J. Discrete Algorithms 10, 5–9 (2012)
Munro, I.: Tables. In: Proceedings of the 16th Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pp. 37–42 (1996)
Rytter, W.: Application of Lempel–Ziv factorization to the approximation of grammar-based compression. Theoret. Comput. Sci. 302(1–3), 211–222 (2003)
Acknowledgments
We thank the anonymous reviewers for their helpful comments.
Additional information
A preliminary version of this paper appeared in the 21st Annual European Symposium on Algorithms (ESA 2013).
Gad M. Landau: Supported in part by the National Science Foundation (NSF) Grant 0904246, the Israel Science Foundation (ISF) Grant 347/09, and the United States-Israel Binational Science Foundation (BSF) Grant 2008217. Oren Weimann: Supported in part by the Israel Science Foundation Grant 794/13.
About this article
Cite this article
Gagie, T., Hermelin, D., Landau, G.M. et al. Binary Jumbled Pattern Matching on Trees and Tree-Like Structures. Algorithmica 73, 571–588 (2015). https://doi.org/10.1007/s00453-014-9957-6
DOI: https://doi.org/10.1007/s00453-014-9957-6