Online Pattern Matching for String Edit Distance with Moves

  • Conference paper
String Processing and Information Retrieval (SPIRE 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8799)

Abstract

Edit distance with moves (EDM) is a string-to-string distance measure that allows substring moves in addition to the ordinary editing operations when turning one string into the other. Although computing the optimal EDM is intractable, it has many applications, especially in error detection. Edit-sensitive parsing (ESP) is an efficient parsing algorithm that guarantees an upper bound on the parsing discrepancies between different occurrences of the same substring in a string. ESP can be used to compute an approximate EDM as the L1 distance between characteristic vectors built from the node labels of parse trees. However, ESP is not applicable to streaming text data, where the whole text is not known in advance. We present an online ESP (OESP) that enables online pattern matching for EDM. OESP builds a parse tree for a streaming text and computes the L1 distance between characteristic vectors in an online manner. For space-efficient computation of EDM, OESP directly encodes the parse tree into a succinct representation, leveraging the idea behind recent results on dynamic succinct trees. We experimentally test OESP's ability to compute EDM in an online manner on benchmark datasets and show its efficiency.
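
To make the characteristic-vector idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a naive parse that pairs adjacent symbols left to right at every level, whereas ESP chooses its groupings so that repeated substrings are parsed almost identically wherever they occur; the naive pairing carries no such guarantee. The function names `characteristic_vector` and `l1_distance` are hypothetical. The sketch only shows how node labels of a parse tree are counted and how two such count vectors are compared with the L1 distance.

```python
from collections import Counter


def characteristic_vector(text):
    """Count every node label in a naive bottom-up binary parse of `text`.

    Assumption (not ESP): adjacent symbols are paired left to right at each
    level, and an unpaired trailing symbol is carried up unchanged.
    A leaf is labelled by its character; an internal node is labelled by the
    tuple of its two child labels, so equal subtrees get equal labels.
    """
    counts = Counter(text)          # each leaf contributes one count
    level = list(text)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            node = (level[i], level[i + 1])
            counts[node] += 1       # one count per internal node
            next_level.append(node)
        if len(level) % 2 == 1:
            # Odd symbol out: promote it without creating a new node.
            next_level.append(level[-1])
        level = next_level
    return counts


def l1_distance(u, v):
    """L1 distance between two sparse count vectors."""
    return sum(abs(u[key] - v[key]) for key in u.keys() | v.keys())


if __name__ == "__main__":
    a = characteristic_vector("abab")
    b = characteristic_vector("abba")
    print(l1_distance(a, a))  # identical strings give distance 0
    print(l1_distance(a, b))
```

With this naive pairing, inserting a single character at the front of a string shifts every pair and can change most labels. Bounding that kind of discrepancy is exactly what the abstract attributes to ESP.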

This work was supported by JSPS KAKENHI (24700140, 26280088) and the JST PRESTO program.




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Takabatake, Y., Tabei, Y., Sakamoto, H. (2014). Online Pattern Matching for String Edit Distance with Moves. In: Moura, E., Crochemore, M. (eds) String Processing and Information Retrieval. SPIRE 2014. Lecture Notes in Computer Science, vol 8799. Springer, Cham. https://doi.org/10.1007/978-3-319-11918-2_20

  • DOI: https://doi.org/10.1007/978-3-319-11918-2_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11917-5

  • Online ISBN: 978-3-319-11918-2

  • eBook Packages: Computer Science, Computer Science (R0)
