Dynamic Programming

Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1079)

Abstract

When the quality of a biological sequence alignment is determined by scoring its aligned sections independently, the overall alignment score can be defined recursively. This property is not only biologically meaningful but also makes it possible to find optimal alignments with dynamic programming-based algorithms. Dynamic programming is an efficient problem-solving technique for problems that can be divided into overlapping subproblems. Pairwise sequence alignment techniques such as the Needleman–Wunsch and Smith–Waterman algorithms apply dynamic programming to pairwise alignment problems. These algorithms offer polynomial time and space solutions. In this chapter, we introduce the basic dynamic programming solutions for global, semi-global, and local alignment problems. We then discuss algorithmic improvements offering quadratic-time and linear-space programs, as well as approximate solutions based on space-reduction and seeding heuristics. Finally, we briefly introduce the application of these techniques to multiple sequence alignment.
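
The recursive score definition described above can be illustrated with a minimal sketch in Python. It is not the chapter's own code: the function names `global_score` and `local_score` and the scoring values (match +1, mismatch -1, linear gap -2) are assumptions chosen only for illustration. Each cell depends only on its three neighbours, so two rows of the table suffice when only the optimal score, and not the alignment itself, is needed.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score in the style of Needleman-Wunsch.

    Scoring values are illustrative assumptions (linear gap cost).
    Only two rows are stored, so the alignment is not recovered.
    """
    # Row 0: aligning the empty prefix of `a` with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal local alignment score in the style of Smith-Waterman.

    Same recurrence, but a cell may restart at 0 and the answer is the
    maximum over all cells rather than the last cell.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0,
                       prev[j - 1] + sub,
                       prev[j] + gap,
                       curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

Under these assumed scores, `global_score("ACGT", "AGT")` evaluates to 1 (three matches and one gap), and `local_score("TTACGTT", "GGACGGG")` evaluates to 3 (the shared segment `ACG`). Both functions run in time proportional to the product of the sequence lengths.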

Key words

Dynamic programming, Needleman–Wunsch algorithm, Smith–Waterman algorithm, Affine gap penalties, Hirschberg’s algorithm, Banded dynamic programming, Bounded dynamic programming, Seeding

Copyright information

© Springer Science+Business Media, LLC 2014

Authors and Affiliations

  1. Department of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, USA
