Abstract
In the minimum common string partition problem one is given two strings S and T with the same character statistics and one seeks the smallest partition of S into substrings so that T can also be partitioned into the same substring multiset. The problem is fundamental in several variants of edit distance with block operations, e.g. signed reversal distance with duplicates and edit distance with moves.
The minimum common string partition problem is known to be NP-complete and the best approximation known is of order O(lognlog* n). Since this problem is of utmost practical importance one seeks a heuristic that will (1) usually have a low approximation factor and (2) will run fast.
A simple greedy algorithm is known and it has been well-studied from an approximation point of view. It has been shown to have a bad worst case approximation factor. However, all the bad approximation factors presented so far stem from complicated recursive construction. In practice the greedy algorithm seems to have small approximation factors. However, the best current implementation of greedy runs in quadratic time.
We propose a novel method to implement greedy in linear time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Chen, X., Zheng, J., Fu, Z., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biology Bioinform. 2(4), 302–315 (2005)
Christie, D.A., Irving, R.W.: Sorting strings by reversals and by transpositions. SIAM J. Discrete Math. 14(2), 193–206 (2001)
Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. ACM Transactions on Algorithms 1(2), 350–366 (2005)
Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proceedings of the Annual Symposium on Discrete Algorithms (SODA), pp. 667–676 (2002)
Farach, M.: Optimal suffix tree construction with large alphabets. In: FOCS 1997: Proceedings of the 38th Annual Symposium on Foundations of Computer Science, Washington, DC, USA, p. 137. IEEE Computer Society, Los Alamitos (1997)
Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: Hardness and approximations. Electr. J. Comb. 12(1) (2005)
Hannenhalli, S., Pevzner, P.A.: Transforming cabbage into turnip: Polynomial algorithm for sorting signed permutations by reversals. J. ACM 46(1), 1–27 (1999)
Jiang, H., Zhu, B., Zhu, D., Zhu, H.: Minimum common string partition revisited. In: Lee, D.-T., Chen, D.Z., Ying, S. (eds.) FAW 2010. LNCS, vol. 6213, pp. 45–52. Springer, Heidelberg (2010)
Kaplan, H., Shafrir, N.: The greedy algorithm for edit distance with moves. Inf. Process. Lett. 97(1), 23–27 (2006)
Kolman, P.: Approximating reversal distance for strings with bounded number of duplicates. In: Jedrzejowicz, J., Szepietowski, A. (eds.) MFCS 2005. LNCS, vol. 3618, pp. 580–590. Springer, Heidelberg (2005)
Kolman, P., Walen, T.: Reversal distance for strings with duplicates: Linear time approximation using hitting set. Electr. J. Comb. 14(1) (2007)
Kruskal, J., Sankoff, D.: Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, Reading (1999)
Lopresti, D.P., Tomkins, A.: Block edit models for approximate string matching. Theor. Comput. Sci. 181(1), 159–179 (1997)
McCreight, E.M.: A space-economical suffix tree construction algorithm. J. ACM 23(2), 262–272 (1976)
Shapira, D., Storer, J.A.: Edit distance with move operations. J. Discrete Algorithms 5(2), 380–392 (2007)
Tichy, W.F.: The string-to-string correction problem with block moves. ACM Trans. Comput. Syst. 2(4), 309–321 (1984)
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Weiner, P.: Linear pattern matching algorithms. In: 14th Annual Symposium on Switching and Automata Theory, pp. 1–11. IEEE, Los Alamitos (1973)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Goldstein, I., Lewenstein, M. (2011). Quick Greedy Computation for Minimum Common String Partitions. In: Giancarlo, R., Manzini, G. (eds) Combinatorial Pattern Matching. CPM 2011. Lecture Notes in Computer Science, vol 6661. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21458-5_24
Download citation
DOI: https://doi.org/10.1007/978-3-642-21458-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21457-8
Online ISBN: 978-3-642-21458-5
eBook Packages: Computer ScienceComputer Science (R0)