On the Approximation Ratio of the Group-Merge Algorithm for the Shortest Common Superstring Problem

  • Dirk Bongartz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1963)


The shortest common superstring problem (SCS) is one of the fundamental optimization problems in the areas of data compression and DNA sequencing. The SCS is known to be APX-hard [1]. This paper analyzes the approximation ratio of two greedy-based approximation algorithms for the SCS, namely the naive Greedy algorithm and the Group-Merge algorithm. The main results of this paper are: (i) We disprove the claim that the input instances of Jiang and Li [4] show that the Group-Merge algorithm does not provide any constant approximation for the SCS. We even prove that the Group-Merge algorithm always finds optimal solutions for these instances. (ii) We show that the Greedy algorithm and the Group-Merge algorithm are incomparable with respect to the approximation ratio. (iii) We address the main open question of whether the Group-Merge algorithm has a constant approximation ratio by showing that this is the case for a slightly modified algorithm, denoted Group-Merge-1, if all strings have approximately the same length and the compression is limited by a constant fraction of the trivial solution.
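For orientation, the naive Greedy algorithm named in the abstract can be sketched as follows. This is a minimal sketch of the usual textbook formulation (repeatedly merge the two strings with the largest overlap), not necessarily the exact definition analyzed in the paper. The function names `overlap` and `greedy_superstring` and the tie-breaking rule (first pair found) are assumptions made for illustration. The Group-Merge algorithm is not sketched here because its precise definition is not given in the abstract.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest proper suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Greedy heuristic: repeatedly merge the pair with maximum overlap.

    Assumption: ties are broken by taking the first maximal pair found.
    """
    # Drop duplicates and strings that occur inside another input string;
    # they are covered automatically by any superstring of the rest.
    unique = list(dict.fromkeys(strings))
    pool = [s for s in unique if not any(s != t and s in t for t in unique)]

    while len(pool) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = pool[i] + pool[j][k:]  # glue the pair, sharing k characters
        pool = [s for idx, s in enumerate(pool) if idx not in (i, j)]
        pool.append(merged)

    return pool[0] if pool else ""


if __name__ == "__main__":
    fragments = ["abc", "bcd", "cde"]
    result = greedy_superstring(fragments)
    trivial = sum(len(s) for s in fragments)  # plain concatenation
    print(result, "compression:", trivial - len(result))
```

In this sketch, the "trivial solution" is taken to be the plain concatenation of all input strings, and the compression of a superstring is the number of characters saved relative to that concatenation. Both readings follow common usage in the superstring literature and are assumptions here.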



References

  1. Blum, A., Jiang, T., Li, M., Tromp, J., Yannakakis, M.: Linear approximation for shortest superstrings. In: Journal of the ACM 41(4), pp. 630–647, July 1994.
  2. Gallant, J., Maier, D., Storer, J. A.: On Finding Minimal Length superstrings. In: Journal of Computer and System Sciences 20, pp. 50–58, 1980.
  3. Li, M.: Toward a DNA sequencing theory. In: Proc. 31st IEEE Symp. on Foundation of Computer Science, pp. 125–134, 1990.
  4. Jiang, T., Li, M.: DNA Sequencing and String Learning. In: Mathematical Systems Theory 29, pp. 387–405, 1996.
  5. Sweedyk, Z.: A 2 1/2-approximation algorithm for shortest superstring. In: SIAM Journal on Computing 29(3), pp. 954–986, 1999.
  6. Tarhio, J., Ukkonen, E.: A greedy approximation algorithm for constructing shortest common superstrings. In: Theoretical Computer Science 57, pp. 131–145, 1988.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Dirk Bongartz (1)
  1. Lehrstuhl für Informatik I, RWTH Aachen, Aachen, Germany
