Abstract
Elastic-degenerate text provides a novel and effective method for modeling collections of texts that have local variations. Due to its applicability in pan-genomics, an index for an elastic-degenerate text that can efficiently report the occurrences of a given query pattern is desirable. This paper attempts to dash our hopes for such an index, one that is deterministic and has good worst-case query time. We do so by providing conditional lower bounds based on the Orthogonal Vectors Hypothesis (OVH) (and hence the Strong Exponential Time Hypothesis). We show that, even with arbitrary polynomial preprocessing time, an index for an elastic-degenerate text with n degenerate letters that can perform queries on a pattern of length m in time \(O(n^{\alpha } m^{\beta })\) for constants \(\alpha \) and \(\beta \) where \(\alpha < 1\) or \(\beta < 1\) would violate OVH. Additionally, we provide an elastic-degenerate text index whose query time is independent of the size N (distinct from its length) of the elastic-degenerate text. Finally, we investigate the hardness of matching elastic-degenerate text to elastic-degenerate text.
Supported in part by the U.S. National Science Foundation (NSF) under CCF-1703489.
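To make the object being indexed concrete, here is a minimal Python sketch of naive pattern matching on an elastic-degenerate text. It is not the index or any algorithm from the paper. It assumes the convention common in the elastic-degenerate string matching literature: the text is a sequence of segments, each a finite set of strings that may include the empty string, and a pattern occurs if it can be spelled by a suffix of a string in one segment, whole strings from the following segments, and a prefix of a string in a final segment. The function name `ed_occurrences` and the example text are hypothetical.

```python
def ed_occurrences(segments, pattern):
    """Return the indices of segments at which some occurrence of `pattern` ends.

    `segments` is a list of collections of strings (an assumed encoding of an
    elastic-degenerate text); the empty string "" models a deletion.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    ends = []
    # Lengths p (0 < p < len(pattern)) of pattern prefixes that have been
    # matched so far and end exactly at the previous segment boundary.
    active = set()
    for i, segment in enumerate(segments):
        next_active = set()
        found = False
        for s in segment:
            # (a) Occurrences that start inside s.
            for start in range(len(s)):
                tail = s[start:]
                if tail.startswith(pattern):
                    found = True                   # pattern lies entirely within s
                elif pattern.startswith(tail):
                    next_active.add(len(tail))     # suffix of s is a proper prefix of pattern
            # (b) Partial matches carried over from earlier segments.
            for p in active:
                rest = pattern[p:]
                if s.startswith(rest):
                    found = True                   # remainder of pattern is a prefix of s
                elif rest.startswith(s):
                    next_active.add(p + len(s))    # all of s is consumed, match continues
        if found:
            ends.append(i)
        active = next_active
    return ends


# Hypothetical text: {A} . {C, CG, ""} . {T} . {A, G}
text = [["A"], ["C", "CG", ""], ["T"], ["A", "G"]]
print(ed_occurrences(text, "CGT"))  # uses "CG" from segment 1, then "T"
print(ed_occurrences(text, "AT"))   # skips segment 1 via the empty string
print(ed_occurrences(text, "AG"))   # no occurrence: "T" in segment 2 is mandatory
```

This scan touches every string in every segment for each query, so its running time grows with the total size N of the text, which is exactly the dependence an index is meant to avoid.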
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gibney, D. (2020). An Efficient Elastic-Degenerate Text Index? Not Likely. In: Boucher, C., Thankachan, S.V. (eds) String Processing and Information Retrieval. SPIRE 2020. Lecture Notes in Computer Science, vol 12303. Springer, Cham. https://doi.org/10.1007/978-3-030-59212-7_6
DOI: https://doi.org/10.1007/978-3-030-59212-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59211-0
Online ISBN: 978-3-030-59212-7
eBook Packages: Computer Science, Computer Science (R0)