An Efficient Elastic-Degenerate Text Index? Not Likely

  • Conference paper
String Processing and Information Retrieval (SPIRE 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12303)

Abstract

Elastic-degenerate text provides a novel and effective method for modeling collections of text that have local variations. Due to its applicability in pan-genomics, an index for an elastic-degenerate text which can efficiently report the occurrences of a given query pattern is desirable. This paper attempts to dash our hopes for such an index, one that is deterministic and has good worst-case query time. We do so by providing conditional lower bounds based on the Orthogonal Vectors Hypothesis (OVH) (and hence the Strong Exponential Time Hypothesis). We show that, even with arbitrary polynomial preprocessing time, an index for an elastic-degenerate text with n degenerate letters that can perform queries on a pattern of length m in time for constants \(\alpha \) and \(\beta \) where or would violate OVH. Additionally, we provide an elastic-degenerate text index with query time , which is independent of the size N (distinct from its length) of the elastic-degenerate text. Finally, we investigate the hardness of matching elastic-degenerate text to elastic-degenerate text.
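To make the terms in the abstract concrete, the following is a minimal sketch of a naive scan over an elastic-degenerate text. It is not the index or any algorithm from the paper. It assumes the usual representation of an elastic-degenerate text as a sequence of n degenerate letters, each a finite set of alternative strings that may include the empty string. The names `EDText`, `ed_length_and_size` and `ed_occurrence_ends` are hypothetical, and counting only characters toward the size N (so the empty string contributes nothing) is an assumption of this sketch.

```python
from typing import List, Set, Tuple

# Assumed representation: one set of alternative strings per degenerate letter.
EDText = List[Set[str]]


def ed_length_and_size(text: EDText) -> Tuple[int, int]:
    """Return (n, N): number of degenerate letters and total character count."""
    n = len(text)
    size = sum(len(s) for segment in text for s in segment)
    return n, size


def ed_occurrence_ends(text: EDText, pattern: str) -> List[int]:
    """Return indices of degenerate letters in which an occurrence of pattern ends."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    ends = set()
    # Lengths k (0 < k < m) such that pattern[:k] ends exactly at the current boundary.
    active = set()
    for i, segment in enumerate(text):
        next_active = set()
        for s in segment:
            # Occurrence contained entirely in one alternative.
            if pattern in s:
                ends.add(i)
            # Extend partial matches carried over from earlier degenerate letters.
            for k in active:
                rest = pattern[k:]
                if s.startswith(rest):
                    ends.add(i)
                elif rest.startswith(s):
                    # s is strictly shorter than rest here, so k + len(s) < m.
                    next_active.add(k + len(s))
            # Start new partial matches at a suffix of this alternative.
            for k in range(1, min(len(s), m - 1) + 1):
                if s.endswith(pattern[:k]):
                    next_active.add(k)
        active = next_active
    return sorted(ends)


if __name__ == "__main__":
    text = [{"A"}, {"C", "G", ""}, {"T"}]
    print(ed_length_and_size(text))        # (3, 4)
    print(ed_occurrence_ends(text, "AT"))  # [2]
    print(ed_occurrence_ends(text, "AA"))  # []
```

This scan reads every alternative of every degenerate letter, so its running time grows with the size N. That dependence is what the index described in the abstract avoids at query time.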

Supported in part by the U.S. National Science Foundation (NSF) under CCF-1703489.

Author information

Corresponding author

Correspondence to Daniel Gibney.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Gibney, D. (2020). An Efficient Elastic-Degenerate Text Index? Not Likely. In: Boucher, C., Thankachan, S.V. (eds) String Processing and Information Retrieval. SPIRE 2020. Lecture Notes in Computer Science, vol 12303. Springer, Cham. https://doi.org/10.1007/978-3-030-59212-7_6

  • DOI: https://doi.org/10.1007/978-3-030-59212-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59211-0

  • Online ISBN: 978-3-030-59212-7

  • eBook Packages: Computer Science, Computer Science (R0)
