Computing Maximum-Scoring Segments in Almost Linear Time
Given a sequence of length n, the problem studied in this paper is to find a set of k disjoint contiguous subsequences (segments) such that the total sum of all elements in the set is maximized. This problem arises naturally in the analysis of DNA sequences. The previous best known algorithm requires Θ(n log n) time in the worst case. We present an almost linear-time algorithm for this problem. Our algorithm uses a disjoint-set data structure and requires O(nα(n, n)) time in the worst case, where α(n, n) is the inverse Ackermann function.
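To make the problem statement concrete, the following is a minimal reference sketch in Python. It is not the almost linear-time algorithm of the paper: it is a straightforward O(nk) dynamic program, useful only as a baseline for checking small inputs. The function name `max_k_segments` is hypothetical, and the sketch assumes that exactly k non-empty segments must be chosen and that adjacent segments count as disjoint.

```python
def max_k_segments(a, k):
    """Maximum total sum of exactly k disjoint, non-empty contiguous
    segments of a (baseline O(n*k) dynamic program, not the paper's method)."""
    n = len(a)
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= len(a)")
    neg_inf = float("-inf")
    # prev[i]: best total using j-1 segments inside the prefix a[:i].
    # With zero segments the total is 0 for every prefix.
    prev = [0] * (n + 1)
    for _ in range(k):
        cur = [neg_inf] * (n + 1)  # best total using j segments inside a[:i]
        ending = neg_inf           # best total when the j-th segment ends at a[i-1]
        for i in range(1, n + 1):
            # Either extend the j-th segment by a[i-1], or start it at a[i-1]
            # after the best j-1 segments found inside a[:i-1].
            ending = max(ending, prev[i - 1]) + a[i - 1]
            cur[i] = max(cur[i - 1], ending)
        prev = cur
    return prev[n]


if __name__ == "__main__":
    scores = [1, -2, 3, -1, 2]
    for k in range(0, 4):
        print(k, max_k_segments(scores, k))
```

For the sequence `[1, -2, 3, -1, 2]`, one segment gives 4 (the segment `3, -1, 2`), two segments give 5, and three segments give 6 (the three positive elements taken separately). A baseline like this can serve as a correctness check when experimenting with faster approaches on short sequences.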