Minimum Common String Partition Problem: Hardness and Approximations

  • Avraham Goldstein
  • Petr Kolman
  • Jie Zheng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3341)

Abstract

String comparison is a fundamental problem in computer science, with applications in areas such as computational biology, text processing, and compression. In this paper we address the minimum common string partition problem, a string comparison problem with a tight connection to the problem of sorting by reversals with duplicates, a key problem in genome rearrangement.

A partition of a string A is a sequence \({\mathcal P}=(P_{1},P_{2},\ldots,P_{m})\) of strings, called the blocks, whose concatenation is equal to A. Given a partition \({\mathcal P}\) of a string A and a partition \({\mathcal Q}\) of a string B, we say that the pair \(\langle{\mathcal P},{\mathcal Q}\rangle\) is a common partition of A and B if \({\mathcal Q}\) is a permutation of \({\mathcal P}\). The minimum common string partition problem (MCSP) is to find a common partition of two strings A and B with the minimum number of blocks. The restricted version of MCSP in which each letter occurs at most k times in each input string is denoted by k-MCSP.
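The definitions above can be made concrete with a small sketch. The function names below (`is_common_partition`, `partitions`, `min_common_partition`, `max_occurrences`) are hypothetical and chosen for illustration only. The search is a plain brute force over all cut positions; it takes exponential time, assumes non-empty input strings, and is not one of the approximation algorithms from the paper.

```python
from collections import Counter
from itertools import combinations


def is_common_partition(p, q, a, b):
    """True if p partitions a, q partitions b, and q is a permutation of p."""
    return "".join(p) == a and "".join(q) == b and Counter(p) == Counter(q)


def partitions(s, m):
    """Yield every way to cut s into exactly m non-empty blocks."""
    for cuts in combinations(range(1, len(s)), m - 1):
        bounds = (0, *cuts, len(s))
        yield [s[i:j] for i, j in zip(bounds, bounds[1:])]


def min_common_partition(a, b):
    """Brute-force minimum common partition (tiny inputs only).

    Returns a pair (p, q) with the fewest blocks, or None if the strings
    do not contain the same multiset of letters.
    """
    if Counter(a) != Counter(b):
        return None  # no common partition can exist
    for m in range(1, len(a) + 1):  # try block counts in increasing order
        for p in partitions(a, m):
            target = Counter(p)
            for q in partitions(b, m):
                if Counter(q) == target:
                    return p, q
    return None


def max_occurrences(a, b):
    """Smallest k for which (a, b) is a k-MCSP instance."""
    return max(
        max(Counter(a).values(), default=0),
        max(Counter(b).values(), default=0),
    )


p, q = min_common_partition("abcab", "ababc")
print(p, q)
print(max_occurrences("abcab", "ababc"))
```

For A = abcab and B = ababc, the blocks abc and ab appear in A in that order and in B in the opposite order, so two blocks suffice; since each letter occurs at most twice in each string, this pair is an instance of 2-MCSP.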

In this paper, we show that 2-MCSP (and therefore MCSP) is NP-hard and, moreover, even APX-hard. We describe a 1.1037-approximation for 2-MCSP and a linear-time 4-approximation algorithm for 3-MCSP. We are not aware of any better approximations.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Avraham Goldstein (1)
  • Petr Kolman (2)
  • Jie Zheng (3)
  1. No Institute Given
  2. Institute for Theoretical Computer Science, Charles University, Praha 1, Czech Republic
  3. Department of Computer Science, University of California, Riverside
