On the Structure of Small Motif Recognition Instances
Given a set of sequences S and a degeneracy parameter d, the Consensus Sequence problem asks whether there exists a sequence that has Hamming distance at most d from each sequence in S. A valid motif set is a set of sequences for which such a consensus sequence exists, while a decoy set is a set of sequences that has no consensus sequence but whose pairwise Hamming distances are all at most 2d. At present, no efficient algorithm is known for the Consensus Sequence problem when the number of sequences is greater than three. For instances of Consensus Sequence with binary sequences and cardinality four, we present a combinatorial characterization of decoy sets and a linear-time exact algorithm, resolving an open problem posed by Gramm et al.
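The definitions above can be illustrated with a small brute-force sketch. This is not the linear-time algorithm of the paper; it simply enumerates every binary candidate of the given length, so it is exponential in the sequence length and only suitable for tiny instances. The function names (hamming, has_consensus, is_decoy) and the example sets are hypothetical, chosen for illustration, and sequences are assumed to be equal-length strings over the alphabet {0, 1}.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def has_consensus(seqs, d):
    """Brute force: is some binary sequence within distance d of every input?

    Assumption: all sequences are binary strings of the same length.
    Tries all 2^n candidates, so use only for very short sequences.
    """
    n = len(seqs[0])
    return any(
        all(hamming(candidate, s) <= d for s in seqs)
        for candidate in product("01", repeat=n)
    )


def is_decoy(seqs, d):
    """Decoy set: all pairwise distances are at most 2d, yet no consensus exists."""
    pairwise_ok = all(hamming(a, b) <= 2 * d for a, b in combinations(seqs, 2))
    return pairwise_ok and not has_consensus(seqs, d)


# Illustrative instances (not taken from the paper).
valid_motif_set = ["0000", "0011", "1100", "1111"]  # "0101" is within distance 2 of each
decoy_set = ["000", "011", "101", "110"]            # pairwise distance 2, no center for d = 1

print(has_consensus(valid_motif_set, 2))
print(is_decoy(decoy_set, 1))
```

In the second example every pair of sequences is at distance exactly 2 = 2d, so the pairwise condition holds, but no binary sequence of length three is within distance 1 of all four. Removing any one sequence leaves a valid motif set, which is consistent with the abstract's remark that the difficulty begins once there are more than three sequences.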
Keywords: Consensus Sequence, Integer Linear Program, Binary Sequence, Optimal Sequence, Motif Recognition
- 10. Lanctot, J.K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string selection problems. In: Proc. SODA 1999, pp. 633–642 (1999)
- 13. Pevzner, P., Sze, S.: Combinatorial approaches to finding subtle signals in DNA sequences. In: Proc. ISMB 2000, pp. 344–354 (2000)