A space efficient algorithm for finding the best non-overlapping alignment score
Repeating patterns make up a significant fraction of DNA and protein molecules. These repeating regions are important to biological function because they may act as catalytic, regulatory or evolutionary sites and because they have been implicated in human disease. Additionally, these regions often serve as useful laboratory tools for such tasks as localizing genes on a chromosome and DNA fingerprinting. In this paper, we present a space-efficient algorithm for finding the maximum alignment score for any two substrings of a single string T under the condition that the substrings do not overlap. In a biological context, this corresponds to the largest repeating region in the molecule. The algorithm runs in O(n^2 log^2 n) time and uses only O(n^2) space.
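To make the problem statement concrete, the following is a minimal brute-force sketch and not the algorithm presented in the paper. It relies on the observation that any two non-overlapping substrings of T can be separated by a split point k, so the best score is the maximum, over all k, of the local alignment score between T[:k] and T[k:]. The function names and the scoring scheme (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; the abstract does not specify a scoring scheme. The sketch takes O(n^3) time and O(n) working space, far slower than the O(n^2 log^2 n) bound stated above.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (Smith-Waterman style).

    Uses a linear gap penalty and keeps only two DP rows, so the
    working space is O(len(b)). The scoring values are illustrative
    assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def best_nonoverlapping_score(t, **scoring):
    """Naive baseline: try every split point k and align t[:k] against t[k:].

    Two non-overlapping substrings always lie on opposite sides of some
    split point, so the maximum over all splits is the answer.
    """
    return max(
        (local_alignment_score(t[:k], t[k:], **scoring) for k in range(1, len(t))),
        default=0,  # strings shorter than 2 have no valid split
    )
```

For example, `best_nonoverlapping_score("ACGTACGT")` aligns the first `ACGT` against the second and returns 4 under the assumed scoring.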