Abstract
Let \({{\mathcal {T}}}_1\) and \({{\mathcal {T}}}_2\) be two rooted trees with an equal number of leaves. The leaves are labeled, and the labeling of the leaves in \({{\mathcal {T}}}_2\) is a permutation of those in \({{\mathcal {T}}}_1\). Nodes are associated with weight, such that the weight of a node u, denoted by W(u), is more than the weight of its parent. A node \(x \in {{\mathcal {T}}}_1\) and a node \(y \in {{\mathcal {T}}}_2\) are induced, iff their subtrees have at least one common leaf label. A heaviest induced ancestor query \(\mathsf {HIA} (u_1,u_2)\) with input nodes \(u_1 \in {{\mathcal {T}}}_1\) and \(u_2 \in {{\mathcal {T}}}_2\) asks to output the pair \((u_1^*,u_2^*)\) of induced nodes with the highest combined weight \(\mathsf {W} (u^*_1) + \mathsf {W} (u^*_2)\), such that \(u_1^*\) is an ancestor of \(u_1\) and \(u^*_2\) is an ancestor of \(u_2\). This is a useful primitive in several text processing applications. Gagie et al. (Proceedings of the 25th Canadian Conference on Computational Geometry, CCCG 2013, Waterloo, Ontario, Canada, 2013) introduced this problem and proposed three data structures with the following space-time trade-offs: (i) \(O(n\log ^2n)\) space and \(O(\log n \log \log n)\) query time, (ii) \(O(n\log n)\) space and \(O(\log ^2 n)\) query time, and (iii) O(n) space and \(O(\log ^{3+\epsilon }n)\) query time. Here n is the number of nodes in both trees combined and \(\epsilon >0\) is an arbitrarily small constant. We present two new data structures with better space-time trade-offs: (i) \(O(n\log n)\) space and \(O(\log n \log \log n)\) query time, and (ii) O(n) space and \(O({\log ^2 n}/{\log \log n})\) query time. Additionally, we present new applications of these results.
Similar content being viewed by others
References
Abedin, P., Hooshmand, S., Ganguly, A., Thankachan, S.V.: The heaviest induced ancestors problem revisited. In: Navarro, G., Sankoff, D., Zhu, B. (eds.) Annual Symposium on Combinatorial Pattern Matching, CPM 2018, July 2–4, 2018 - Qingdao, China, LIPIcs, vol. 105, pp. 20:1–20:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018). https://doi.org/10.4230/LIPIcs.CPM.2018.20
Aluru, S.: Handbook of Computational Molecular Biology. Chapman and Hall/CRC (2005)
Amir, A., Boneh, I.: Locally maximal common factors as a tool for efficient dynamic string algorithms. In: Navarro, G., Sankoff, D., Zhu, B. (eds.) Annual Symposium on Combinatorial Pattern Matching, CPM 2018, July 2–4, 2018 - Qingdao, China, LIPIcs, vol. 105, pp. 11:1–11:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018). https://doi.org/10.4230/LIPIcs.CPM.2018.11
Amir, A., Boneh, I., Charalampopoulos, P., Kondratovsky, E.: Repetition detection in a dynamic string. In: Bender, M.A., Svensson, O., Herman, G. (eds.) 27th Annual European Symposium on Algorithms, ESA 2019, September 9–11, 2019, Munich/Garching, Germany, LIPIcs, vol. 144, pp. 5:1–5:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.ESA.2019.5
Amir, A., Charalampopoulos, P., Iliopoulos, C.S., Pissis, S.P., Radoszewski, J.: Longest common factor after one edit operation. In: Fici, G., Sciortino, M., Venturini, R. (eds.) String Processing and Information Retrieval - 24th International Symposium, SPIRE 2017, Palermo, Italy, September 26–29, 2017, Proceedings, Lecture Notes in Computer Science, vol. 10508, pp. 14–26. Springer (2017). https://doi.org/10.1007/978-3-319-67428-5_2
Amir, A., Charalampopoulos, P., Pissis, S.P., Radoszewski, J.: Longest common substring made fully dynamic. In: Bender, M.A., Svensson, O., Herman, G. (eds.) 27th Annual European Symposium on Algorithms, ESA 2019, September 9–11, 2019, Munich/Garching, Germany, LIPIcs, vol. 144, pp. 6:1–6:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.ESA.2019.6
Amir, A., Charalampopoulos, P., Pissis, S.P., Radoszewski, J.: Dynamic and internal longest common substring. Algorithmica 82(12), 3707–3743 (2020). https://doi.org/10.1007/s00453-020-00744-0
Amir, A., Kondratovsky, E.: Searching for a modified pattern in a changing text. In: Gagie, T., Moffat, A., Navarro, G., Cuadros-Vargas, E. (eds.) String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Lima, Peru, October 9–11, 2018, Proceedings, Lecture Notes in Computer Science, vol. 11147, pp. 241–253. Springer (2018). https://doi.org/10.1007/978-3-030-00479-8_20
Bender, M.A., Farach-Colton, M.: The LCA problem revisited. In: Gonnet, G.H., Panario, D., Viola, A. (eds.) LATIN 2000: Theoretical Informatics, 4th Latin American Symposium, Punta del Este, Uruguay, April 10–14, 2000, Proceedings, Lecture Notes in Computer Science, vol. 1776, pp. 88–94. Springer (2000). https://doi.org/10.1007/10719839_9
Brodal, G.S., Jørgensen, A.G.: Data structures for range median queries. In: Dong, Y., Du, D., Ibarra, O.H. (eds.) Algorithms and Computation, 20th International Symposium, ISAAC 2009, Honolulu, Hawaii, USA, December 16–18, 2009. Proceedings, Lecture Notes in Computer Science, vol. 5878, pp. 822–831. Springer (2009). https://doi.org/10.1007/978-3-642-10631-6_83
Chan, T.M., Larsen, K.G., Patrascu, M.: Orthogonal range searching on the RAM, revisited. In: Proceedings of the 27th ACM Symposium on Computational Geometry, Paris, France, June 13–15, 2011, pp. 1–10 (2011). https://doi.org/10.1145/1998196.1998198.
Charalampopoulos, P., Gawrychowski, P., Pokorski, K.: Dynamic longest common substring in polylogarithmic time. In: Czumaj, A., Dawar, A., Merelli, E. (eds.) 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, July 8–11, 2020, Saarbrücken, Germany (Virtual Conference), LIPIcs, vol. 168, pp. 27:1–27:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020). https://doi.org/10.4230/LIPIcs.ICALP.2020.27
Chockalingam, S.P., Thankachan, S.V., Aluru, S.: A parallel algorithm for finding all pairs k-mismatch maximal common substrings. In: West, J., Pancake, C.M. (eds.) Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2016, Salt Lake City, UT, USA, November 13–18, 2016, pp. 784–794. IEEE Computer Society (2016). https://doi.org/10.1109/SC.2016.66
Demaine, E.D., Landau, G.M., Weimann, O.: On cartesian trees and range minimum queries. Algorithmica 68(3), 610–625 (2014). https://doi.org/10.1007/s00453-012-9683-x
Farach, M.: Optimal suffix tree construction with large alphabets. In: 38th Annual Symposium on Foundations of Computer Science, FOCS ’97, Miami Beach, Florida, USA, October 19–22, 1997, pp. 137–143. IEEE Computer Society (1997). https://doi.org/10.1109/SFCS.1997.646102
Ferragina, P., Manzini, G.: Indexing compressed text. J. ACM 52(4), 552–581 (2005). https://doi.org/10.1145/1082036.1082039
Fischer, J., Heun, V.: Space-efficient preprocessing schemes for range minimum queries on static arrays. SIAM J. Comput. 40(2), 465–492 (2011). https://doi.org/10.1137/090779759
Funakoshi, M., Mieno, T.: Minimal unique palindromic substrings after single-character substitution. In: Lecroq, T., Touzet, H. (eds.) String Processing and Information Retrieval - 28th International Symposium, SPIRE 2021, Lille, France, October 4–6, 2021, Proceedings, Lecture Notes in Computer Science, vol. 12944, pp. 33–46. Springer (2021). https://doi.org/10.1007/978-3-030-86692-1_4
Funakoshi, M., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Longest substring palindrome after edit. In: Navarro, G., Sankoff, D., Zhu, B. (eds.) Annual Symposium on Combinatorial Pattern Matching, CPM 2018, July 2–4, 2018 - Qingdao, China, LIPIcs, vol. 105, pp. 12:1–12:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018). https://doi.org/10.4230/LIPIcs.CPM.2018.12
Funakoshi, M., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Faster queries for longest substring palindrome after block edit. In: Pisanti, N., Pissis, S.P. (eds.) 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, June 18–20, 2019, Pisa, Italy, LIPIcs, vol. 128, pp. 27:1–27:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.CPM.2019.27
Funakoshi, M., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Computing longest palindromic substring after single-character or block-wise edits. Theor. Comput. Sci. 859, 116–133 (2021). https://doi.org/10.1016/j.tcs.2021.01.014
Gagie, T., Gawrychowski, P., Nekrich, Y.: Heaviest induced ancestors and longest common substrings. In: Proceedings of the 25th Canadian Conference on Computational Geometry, CCCG 2013, Waterloo, Ontario, Canada, August 8–10, 2013. Carleton University, Ottawa, Canada (2013). http://cccg.ca/proceedings/2013/papers/paper_29.pdf
Grossi, R., Vitter, J.S.: Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM J. Comput. 35(2), 378–407 (2005). https://doi.org/10.1137/S0097539702402354
Gusfield, D.: Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13(2), 338–355 (1984). https://doi.org/10.1137/0213024
JáJá, J., Mortensen, C.W., Shi, Q.: Space-efficient and fast algorithms for multidimensional dominance reporting and counting. In: Fleischer, R., Trippen, G. (eds.) Algorithms and Computation, 15th International Symposium, ISAAC 2004, Hong Kong, China, December 20–22, 2004, Proceedings, Lecture Notes in Computer Science, vol. 3341, pp. 558–568. Springer (2004). https://doi.org/10.1007/978-3-540-30551-4_49
Nekrich, Y., Navarro, G.: Sorted range reporting. In: Fomin, F.V., Kaski, P. (eds.) Algorithm Theory - SWAT 2012 - 13th Scandinavian Symposium and Workshops, Helsinki, Finland, July 4–6, 2012. Proceedings, Lecture Notes in Computer Science, vol. 7357, pp. 271–282. Springer (2012). https://doi.org/10.1007/978-3-642-31155-0_24
Sadakane, K.: Succinct representations of lcp information and improvements in the compressed suffix arrays. In: Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 6–8, 2002, San Francisco, CA, USA., pp. 225–232 (2002). http://dl.acm.org/citation.cfm?id=545381.545410
Sadakane, K., Navarro, G.: Fully-functional succinct trees. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17–19, 2010, pp. 134–149 (2010). https://doi.org/10.1137/1.9781611973075.13
Sleator, D.D., Tarjan, R.E.: A data structure for dynamic trees. In: Proceedings of the 13th Annual ACM Symposium on Theory of Computing, May 11–13, 1981, Milwaukee, Wisconsin, USA, pp. 114–122 (1981). https://doi.org/10.1145/800076.802464
Thankachan, S.V., Chockalingam, S.P., Aluru, S.: An efficient algorithm for finding all pairs k-mismatch maximal common substrings. In: Bourgeois, A.G., Skums, P., Wan, X., Zelikovsky, A. (eds.) Bioinformatics Research and Applications - 12th International Symposium, ISBRA 2016, Minsk, Belarus, June 5–8, 2016, Proceedings, Lecture Notes in Computer Science, vol. 9683, pp. 3–14. Springer (2016). https://doi.org/10.1007/978-3-319-38782-6_1
Urabe, Y., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Longest lyndon substring after edit. In: Navarro, G., Sankoff, D., Zhu, B. (eds.) Annual Symposium on Combinatorial Pattern Matching, CPM 2018, July 2–4, 2018 - Qingdao, China, LIPIcs, vol. 105, pp. 19:1–19:10. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018). https://doi.org/10.4230/LIPIcs.CPM.2018.19
Weiner, P.: Linear pattern matching algorithms. In: 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15–17, 1973, pp. 1–11 (1973). https://doi.org/10.1109/SWAT.1973.13
Willard, D.E.: Log-logarithmic worst-case range queries are possible in space theta(n). Inf. Process. Lett. 17(2), 81–84 (1983). https://doi.org/10.1016/0020-0190(83)90075-3
Zhou, G.: Two-dimensional range successor in optimal time and almost linear space. Inf. Process. Lett. 116(2), 171–174 (2016). https://doi.org/10.1016/j.ipl.2015.09.002
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23(3), 337–343 (1977). https://doi.org/10.1109/TIT.1977.1055714
Acknowledgements
This research is supported in part by the U.S. National Science Foundation under the grants CCF-1703489 and CCF-2112643.
Author information
Authors and Affiliations
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
An early version of this work appeared in CPM 2018 [1]
Rights and permissions
About this article
Cite this article
Abedin, P., Hooshmand, S., Ganguly, A. et al. The Heaviest Induced Ancestors Problem: Better Data Structures and Applications. Algorithmica 84, 2088–2105 (2022). https://doi.org/10.1007/s00453-022-00955-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-022-00955-7