On the Closest String via Rank Distance
Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ whose “distance” to t is at most d. The closest string problem (CSP) is the special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and have been studied extensively in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial-time k-approximation algorithm for CSP. Finally, we give a parameterized algorithm for CSP (the parameter is the number of input strings) for the case where the alphabet is binary and each string has the same number of 0’s and 1’s.
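To make the CSP objective concrete, the following is a minimal brute-force sketch in Python, not the paper's algorithm: it enumerates every candidate string t of length n and keeps the one with the smallest maximum distance to the inputs. The function names (`hamming`, `rank_distance`, `closest_string`) are hypothetical. The rank distance shown is an assumption, since the abstract does not state the definition: characters are indexed by occurrence, a shared indexed character contributes the absolute difference of its 1-based positions, and an unshared one contributes its position.

```python
from collections import defaultdict
from itertools import product


def hamming(u, v):
    """Number of mismatching positions; assumes len(u) == len(v)."""
    return sum(a != b for a, b in zip(u, v))


def rank_distance(u, v):
    """Rank distance under the ASSUMED definition described in the lead-in."""

    def positions(s):
        # Map each indexed character (char, occurrence number) to its 1-based position.
        seen = defaultdict(int)
        pos = {}
        for i, ch in enumerate(s, start=1):
            seen[ch] += 1
            pos[(ch, seen[ch])] = i
        return pos

    pu, pv = positions(u), positions(v)
    total = 0
    for key in pu.keys() | pv.keys():
        if key in pu and key in pv:
            total += abs(pu[key] - pv[key])            # shared indexed character
        else:
            total += pu.get(key, 0) + pv.get(key, 0)   # present in only one string
    return total


def closest_string(strings, alphabet, dist):
    """Exhaustive CSP: return (d, t) minimising max_s dist(s, t).

    Runs in time exponential in the string length, so it is only
    suitable for tiny illustrative inputs.
    """
    n = len(strings[0])
    best = None
    for cand in product(alphabet, repeat=n):
        t = "".join(cand)
        radius = max(dist(s, t) for s in strings)
        if best is None or radius < best[0]:
            best = (radius, t)
    return best
```

For example, `closest_string(["000", "011", "101"], "01", hamming)` returns `(1, "001")`. Passing `rank_distance` as `dist` evaluates the same objective under the assumed rank distance. The exhaustive search is deliberate: the abstract states that the problem is NP-hard, so this sketch only illustrates the objective and makes no claim to efficiency.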
Keywords: Close String, Edit Distance, Complete Bipartite Graph, Input String, Rank Aggregation