Abstract
Approximate matching is one of the fundamental problems in pattern matching, and a ubiquitous problem in real applications. The Hamming distance is a simple and well-studied example of approximate matching, motivated by typing errors or noisy channels. Biological and image processing applications assign a different value to mismatches of different symbols.
We consider the problem of approximate matching in the \(L_1\) metric: the \(k\)-\(L_1\)-distance problem. Given a text \(T = t_0, \ldots, t_{n-1}\) and a pattern \(P = p_0, \ldots, p_{m-1}\), both strings of natural numbers, and a natural number \(k\), we seek all text locations \(i\) where the \(L_1\) distance between the pattern and the length-\(m\) substring of the text starting at \(i\) is at most \(k\), i.e. \(\sum_{j=0}^{m-1} |t_{i+j} - p_j| \leq k\).
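To make the problem statement concrete, the following is a minimal brute-force sketch that evaluates the definition directly in \(O(nm)\) time. It is a reference baseline only, not the algorithm of the paper; the function name `l1_matches` is a hypothetical name chosen for illustration.

```python
def l1_matches(text, pattern, k):
    """Return every start position i with sum_j |text[i+j] - pattern[j]| <= k.

    Direct evaluation of the definition (O(n*m) time); an illustrative
    baseline, not the paper's algorithm.
    """
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        dist = 0
        for j in range(m):
            dist += abs(text[i + j] - pattern[j])
            if dist > k:
                # The distance already exceeds k, so position i cannot match.
                break
        else:
            # The inner loop finished without exceeding k.
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(l1_matches([1, 2, 3, 4, 5], [2, 3, 5], 1))  # [1]
```

At position 1 the aligned substring is 2, 3, 4, whose \(L_1\) distance from the pattern 2, 3, 5 is 1, so it is the only match for \(k = 1\).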
We provide an algorithm that solves the \(k\)-\(L_1\)-distance problem in time \(O(n\sqrt{k\log k})\). The algorithm applies a bounded divide-and-conquer approach and makes novel uses of non-boolean convolutions.
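As a hedged illustration of what a non-boolean convolution can contribute to an \(L_1\) computation, the sketch below uses a standard identity rather than the paper's actual construction. For a threshold \(c\), every aligned pair with \(t \geq c > p\) satisfies \(|t - p| = t - p\), so the total over such pairs at each alignment is the difference of two correlations, one of which carries numeric values instead of 0/1 indicators. The names `correlate` and `cross_threshold_l1` are hypothetical, and the correlation is written in its direct quadratic form for clarity; an FFT-based correlation would compute the same arrays faster.

```python
def correlate(a, b):
    """out[i] = sum_j a[i+j] * b[j] for every alignment i (direct form)."""
    n, m = len(a), len(b)
    return [sum(a[i + j] * b[j] for j in range(m)) for i in range(n - m + 1)]


def cross_threshold_l1(text, pattern, c):
    """For each alignment, sum |t - p| over aligned pairs with t >= c > p.

    Illustrative assumption: this shows one way value-carrying (non-boolean)
    correlations arise; it is not claimed to be the paper's algorithm.
    """
    t_ind = [1 if t >= c else 0 for t in text]     # indicator: text value is high
    t_val = [t if t >= c else 0 for t in text]     # high text values, else 0
    p_ind = [1 if p < c else 0 for p in pattern]   # indicator: pattern value is low
    p_val = [p if p < c else 0 for p in pattern]   # low pattern values, else 0
    high = correlate(t_val, p_ind)  # sum of t over qualifying pairs
    low = correlate(t_ind, p_val)   # sum of p over qualifying pairs
    return [h - l for h, l in zip(high, low)]


if __name__ == "__main__":
    print(cross_threshold_l1([1, 5, 2, 7], [1, 6], 3))  # [0, 4, 0]
```

In the example only alignment 1 contains a qualifying pair, (5, 1), which contributes \(5 - 1 = 4\).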
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Amir, A., Lipsky, O., Porat, E., Umanski, J. (2005). Approximate Matching in the L1 Metric. In: Apostolico, A., Crochemore, M., Park, K. (eds) Combinatorial Pattern Matching. CPM 2005. Lecture Notes in Computer Science, vol 3537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11496656_9
DOI: https://doi.org/10.1007/11496656_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26201-5
Online ISBN: 978-3-540-31562-9
eBook Packages: Computer Science, Computer Science (R0)