Approximated Pattern Matching with the \(L_1\), \(L_2\) and \(L_\infty\) Metrics

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5280)

Abstract

Given an alphabet \(\Sigma = \{1, 2, \ldots, |\Sigma|\}\), a text string \(T \in \Sigma^n\) and a pattern string \(P \in \Sigma^m\), for each \(i = 1, 2, \ldots, n - m + 1\) define \(L_d(i)\) as the d-norm distance between the pattern and the text when the pattern is aligned below the text starting at position \(i\) of the text. The problem of pattern matching with \(L_d\) distance is to compute \(L_d(i)\) for every \(i = 1, 2, \ldots, n - m + 1\). We discuss the problem for \(d = 1, \infty\). First, for \(L_1\) matching (pattern matching with the \(L_1\) distance) we present an algorithm that approximates the \(L_1\) matching up to a factor of \(1 + \varepsilon\) and runs in \(O(\frac{1}{\varepsilon^2} n \log m \log |\Sigma|)\) time. Second, we provide an algorithm that approximates the \(L_\infty\) matching up to a factor of \(1 + \varepsilon\) with a run time of \(O(\frac{1}{\varepsilon} n \log m \log |\Sigma|)\). We also generalize the problem of string matching with mismatches to weighted mismatches and present an \(O(n \log^4 m)\) algorithm that approximates the results of this problem up to a factor of \(O(\log m)\) when the weight function is a metric.
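
To make the definitions concrete, the following is a minimal brute-force sketch of the quantities the abstract defines. It is not the approximation algorithm of the paper: it computes the exact distances in O(nm) time and only serves as a reference for what the faster algorithms approximate. The function names `ld_distances` and `weighted_mismatches` are hypothetical, and positions are 0-based here, whereas the abstract counts from 1.

```python
import math
from typing import Callable, List, Sequence


def ld_distances(text: Sequence[int], pattern: Sequence[int], d: float) -> List[float]:
    """Exact d-norm distance for every alignment of pattern against text.

    Brute force, O(n * m). Pass d = math.inf for the L-infinity distance.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    result = []
    for i in range(n - m + 1):
        # Symbol-wise absolute differences for the alignment starting at i.
        diffs = [abs(text[i + j] - pattern[j]) for j in range(m)]
        if math.isinf(d):
            result.append(float(max(diffs)))
        else:
            result.append(sum(x ** d for x in diffs) ** (1.0 / d))
    return result


def weighted_mismatches(
    text: Sequence[int],
    pattern: Sequence[int],
    weight: Callable[[int, int], float],
) -> List[float]:
    """Exact sum of weight(text symbol, pattern symbol) for every alignment.

    The abstract assumes the weight function is a metric; this sketch
    does not check that property.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [
        sum(weight(text[i + j], pattern[j]) for j in range(m))
        for i in range(n - m + 1)
    ]


if __name__ == "__main__":
    T = [1, 5, 2, 4, 3]
    P = [2, 4]
    print(ld_distances(T, P, 1))         # L_1 distance per alignment
    print(ld_distances(T, P, math.inf))  # L_infinity distance per alignment
    # The 0/1 weight recovers ordinary string matching with mismatches.
    print(weighted_mismatches(T, P, lambda a, b: 0 if a == b else 1))
```

With the 0/1 weight function the second routine counts plain mismatches, and with `weight = lambda a, b: abs(a - b)` it coincides with the \(L_1\) case, which is one way to see the weighted problem as a generalization of both.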

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lipsky, O., Porat, E. (2008). Approximated Pattern Matching with the \(L_1\), \(L_2\) and \(L_\infty\) Metrics. In: Amir, A., Turpin, A., Moffat, A. (eds) String Processing and Information Retrieval. SPIRE 2008. Lecture Notes in Computer Science, vol 5280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89097-3_21

  • DOI: https://doi.org/10.1007/978-3-540-89097-3_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89096-6

  • Online ISBN: 978-3-540-89097-3

  • eBook Packages: Computer Science, Computer Science (R0)
