Abstract
In this paper, we consider a commonly used compression scheme called run-length encoding (abbreviated RLE). We provide lower bounds for problems of approximately matching two RLE strings. Specifically, we show that the wildcard matching and k-mismatches problems for RLE strings are 3SUM-hard. For two RLE strings of m and n runs, this result implies that an o(mn)-time algorithm for either problem is very unlikely to exist. We then propose an O(mn + p log m)-time sweep-line algorithm for their combined problem, i.e., wildcard matching with mismatches, where p ≤ mn is the number of matched or mismatched runs. Furthermore, we show that the problem of aligning two RLE strings is also 3SUM-hard.
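To make the objects in the abstract concrete, the following is a minimal sketch of run-length encoding and of counting mismatches between two RLE strings at a single fixed alignment, with wildcards matching any symbol. It is an illustration only and not the sweep-line algorithm of the paper: the names `rle_encode`, `count_mismatches`, and the wildcard symbol `*` are assumptions introduced here, and the sketch assumes both strings decode to the same length. The full matching problem considers every alignment of the pattern against the text, which is where the mn term arises.

```python
from itertools import groupby

WILDCARD = "*"  # hypothetical wildcard symbol, chosen for this sketch only


def rle_encode(text):
    """Return the run-length encoding of text as a list of (symbol, length) runs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(text)]


def count_mismatches(runs_a, runs_b):
    """Count mismatching positions between two RLE strings of equal decoded length.

    Both run lists are walked in lockstep, so the cost is proportional to the
    number of runs (m + n) rather than to the decoded length. A wildcard on
    either side matches any symbol.
    """
    i = j = 0
    mismatches = 0
    remaining_a = runs_a[0][1] if runs_a else 0
    remaining_b = runs_b[0][1] if runs_b else 0
    while i < len(runs_a) and j < len(runs_b):
        # Length of the stretch where the current two runs overlap.
        overlap = min(remaining_a, remaining_b)
        symbol_a, symbol_b = runs_a[i][0], runs_b[j][0]
        if symbol_a != symbol_b and WILDCARD not in (symbol_a, symbol_b):
            mismatches += overlap
        remaining_a -= overlap
        remaining_b -= overlap
        # Advance whichever run (possibly both) has been fully consumed.
        if remaining_a == 0:
            i += 1
            remaining_a = runs_a[i][1] if i < len(runs_a) else 0
        if remaining_b == 0:
            j += 1
            remaining_b = runs_b[j][1] if j < len(runs_b) else 0
    return mismatches


if __name__ == "__main__":
    pattern = rle_encode("aa**bb")
    text = rle_encode("abbbbb")
    print(pattern)
    print(count_mismatches(pattern, text))
```

In this usage example, the pattern `aa**bb` has three runs and the text `abbbbb` has two; only the second position (a against b) counts as a mismatch, since the two wildcard positions match anything.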
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, KY., Hsu, PH., Chao, KM. (2009). Approximate Matching for Run-Length Encoded Strings Is 3sum-Hard. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_15
DOI: https://doi.org/10.1007/978-3-642-02441-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02440-5
Online ISBN: 978-3-642-02441-2
eBook Packages: Computer Science, Computer Science (R0)