Protein Multiple Sequence Alignment

Do, Chuong B.; Katoh, Kazutaka

doi:10.1007/978-1-59745-398-1_25

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 484)

Protein sequence alignment is the task of identifying evolutionarily or structurally related positions in a collection of amino acid sequences. Although the protein alignment problem has been studied for several decades, many recent studies have demonstrated considerable progress in improving the accuracy or scalability of multiple and pairwise alignment tools, or in expanding the scope of tasks handled by an alignment program. In this chapter, we review state-of-the-art protein sequence alignment and provide practical advice for users of alignment tools.

References

Notredame, C. (2002) Recent progress in multiple sequence alignment: a survey. Pharmacogenomics 3, 131–144.
Article PubMed CAS Google Scholar
Needleman, S. B. and Wunsch, C. D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453.
Article PubMed CAS Google Scholar
Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
Article PubMed CAS Google Scholar
Gotoh, O. (1982) An improved algorithm for matching biological sequences. J. Mol. Biol. 162, 705–708.
Article PubMed CAS Google Scholar
Myers, E. W. and Miller, W. (1988) Optimal alignments in linear space. Comput. Appl. Biosci. 4, 11–17.
PubMed CAS Google Scholar
Murata, M., Richardson, J. S., and Sussman, J. L. (1985) Simultaneous comparison of three protein sequences. Proc. Natl. Acad. Sci. USA 82, 3073–3077.
Google Scholar
Waterman, M. S. and Jones, R. (1990) Consensus methods for DNA and protein sequence alignment. Methods Enzymol. 183, 221–237.
Article PubMed CAS Google Scholar
Durbin, R., Eddy, S. R., Krogh, A., and Mitchison, G. (1999) Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge.
Google Scholar
Gonnet, G. H., Korostensky, C., and Benner, S. (2000) Evaluation measures of multiple sequence alignments. J. Comput. Biol. 7, 261–276.
Article PubMed CAS Google Scholar
Wang, L. and Jiang, T. (1994) On the complexity of multiple sequence alignment. J. Comput. Biol. 1, 337–348.
PubMed CAS Google Scholar
Bonizzoni, P. and Della Vedova, G. (2001) The complexity of multiple sequence alignment with SP-score that is a metric. Theor. Comput. Sci. 259, 63–79.
Article Google Scholar
Just, W. (2001) Computational complexity of multiple sequence alignment with SP-score. J. Comput. Biol. 8, 615–623.
Article PubMed CAS Google Scholar
Elias, I. (2006) Settling the intractability of multiple alignment. J. Comput. Biol. 13, 1323–1339.
Article PubMed CAS Google Scholar
Lipman, D. J., Altschul, S. F., and Kececioglu, J. D. (1989) A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. USA 86, 4412–4415.
Google Scholar
Gupta, S. K., Kececioglu, J. D., and Schaffer, A. A. (1995) Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment. J. Comput. Biol. 2, 459–472.
Article PubMed CAS Google Scholar
Carrillo, H. and Lipman, D. (1988) The multiple sequence alignment problem in biology. SIAM J. Appl. Math. 48, 1073–1082.
Article Google Scholar
Dress, A., Fullen, G., and Perrey, S. (1995) A divide and conquer approach to multiple alignment. Proc. Int. Conf. Intell. Syst. Mol. Biol. 3, 107–113.
Google Scholar
Stoye, J., Perrey, S. W., and Dress, A. W. M. (1997) Improving the divide-and-conquer approach to sum-of-pairs multiple sequence alignment. Appl. Math. Lett. 10, 67–73.
Article Google Scholar
Stoye, J., Moulton, V., and Dress, A. W. (1997) DCA: an efficient implementation of the divide-and-conquer approach to simultaneous multiple sequence alignment. Comput. Appl. Biosci. 13, 625–626.
PubMed CAS Google Scholar
Stoye, J. (1998) Multiple sequence alignment with the divide-and-conquer method. Gene 211, GC45–56.
Article PubMed CAS Google Scholar
Reinert, K., Stoye, J., and Will, T. (2000) An iterative method for faster sum-of-pairs multiple sequence alignment. Bioinformatics 16, 808–814.
Article PubMed CAS Google Scholar
Holland, J. H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor.
Google Scholar
Zhang, C. and Wong, A. K. (1997) A genetic algorithm for multiple molecular sequence alignment. Comput. Appl. Biosci. 13, 565–581.
PubMed CAS Google Scholar
Anbarasu, L. A., Narayanasamy, P., and Sundararajan, V. (1998) Multiple sequence alignment using parallel genetic algorithms. SEAL.
Google Scholar
Chellapilla, K. and Fogel, G. B. (1999) Multiple sequence alignment using evolutionary programming. Congress on Evolutionary Computation.
Google Scholar
Gonzalez, R. R., Izquierdo, C. M., and Seijas, J. (1999) Multiple protein sequence comparison by genetic algorithms. SPIE-98.
Google Scholar
Cai, L., Juedes, D., and Liakhovitch, E. (2000) Evolutionary computation techniques for multiple sequence alignment. Congress on Evolutionary Computation.
Google Scholar
Zhang, G.-Z. and Huang, D.-S. (2004) Aligning multiple protein sequence by an improved genetic algorithm. IEEE International Joint Conference on Neural Networks.
Google Scholar
Notredame, C. and Higgins, D. G. (1996) SAGA: sequence alignment by genetic algorithm. Nucleic Acids Res. 24, 1515–1524.
Article PubMed CAS Google Scholar
Isokawa, M., Takahashi, K., and Shimizu, T. (1996) Multiple sequence alignment using a genetic algorithm. Genome Inform. 7, 176–177.
Google Scholar
Harada, Y., Wayama, M., and Shimizu, T. (1997) An inspection of the multiple alignment methods with use of genetic algorithm. Genome Inform. 8, 272–273.
Google Scholar
Hanada, K., Yokoyama, T., and Shimizu, T. (2000) Multiple sequence alignment by genetic algorithm. Genome Inform. 11, 317–318.
CAS Google Scholar
Yokoyama, T., Watanabe, T., Taneda, A., and Shimizu, T. (2001) A web server for multiple sequence alignment using genetic algorithm. Genome Inform. 12, 382–383.
CAS Google Scholar
Nguyen, H. D., Yoshihara, I., Yamamori, K., and Yasunaga, M. (2002) A parallel hybrid genetic algorithm for multiple protein sequence alignment. Evol. Comput. 1, 309–314.
Google Scholar
Kirkpatrick, S., Gelatt, J., C. D., and Vecchi, M. P. (1983) Optimization by simulated annealing. Science 220, 671–680.
Article PubMed CAS Google Scholar
Ishikawa, M., Toya, T., Hoshida, M., Nitta, K., Ogiwara, A., and Kanehisa, M. (1993) Multiple sequence alignment by parallel simulated annealing. Comput. Appl. Biosci. 9, 267–273.
PubMed CAS Google Scholar
Kim, J., Pramanik, S., and Chung, M. J. (1994) Multiple sequence alignment using simulated annealing. Comput. Appl. Biosci. 10, 419–426.
PubMed CAS Google Scholar
Eddy, S. R. (1995) Multiple alignment using hidden Markov models. Proc. Int. Conf. Intell. Syst. Mol. Biol. 3, 114–120.
Google Scholar
Ikeda, T. and Imai, H. (1999) Enhanced A* algorithms for multiple alignments: optimal alignments for several sequences and k-opt approximate alignments for large cases. Theor. Comput. Sci. 210, 341–374.
Article Google Scholar
Horton, P. (2001) Tsukuba BB: a branch and bound algorithm for local multiple alignment of DNA and protein sequences. J. Comput. Biol. 8, 283–303.
Article PubMed CAS Google Scholar
Reinert, K., Lenhof, H.-P., Mutzel, P., Mehlhorn, K., and Kececioglu, J. D. (1997) A branch-and-cut algorithm for multiple sequence alignment. RECOMB.
Google Scholar
Reinert, K., Stoye, J., and Will, T. (1999) Combining divide-and-conquer, the A*-algorithm and successive realignment approaches to speed up multiple sequence alignment. German Conference on Bioinformatics.
Google Scholar
Lermen, M. and Reinert, K. (2000) The practical use of the A* algorithm for exact multiple sequence alignment. J. Comput. Biol. 7, 655–671.
Article PubMed CAS Google Scholar
Feng, D. F. and Doolittle, R. F. (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351–360.
Article PubMed CAS Google Scholar
Taylor, W. R. (1987) Multiple sequence alignment by a pairwise algorithm. Comput. Appl. Biosci. 3, 81–87.
PubMed CAS Google Scholar
Taylor, W. R. (1988) A flexible method to align large numbers of biological sequences. J. Mol. Evol. 28, 161–169.
Article PubMed CAS Google Scholar
Kececioglu, J. and Starrett, D. (2004) Aligning alignments exactly. RECOMB.
Google Scholar
Kececioglu, J. and Zhang, W. (1998) Aligning alignments. CPM.
Google Scholar
Altschul, S. F. (1989) Gap costs for multiple sequence alignment. J. Theor. Biol. 138, 297–309.
Article PubMed CAS Google Scholar
Katoh, K., Misawa, K., Kuma, K., and Miyata, T. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066.
Article PubMed CAS Google Scholar
Edgar, R. C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797.
Article PubMed CAS Google Scholar
Huang, X. (1994) On global sequence alignment. Comput. Appl. Biosci. 10, 227–235.
PubMed CAS Google Scholar
Pei, J., Sadreyev, R., and Grishin, N. V. (2003) PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 19, 427–428.
Article PubMed CAS Google Scholar
Smith, R. F. and Smith, T. F. (1992) Pattern-induced multi-sequence alignment (PIMA) algorithm employing secondary structure-dependent gap penalties for use in comparative protein modelling. Protein Eng. 5, 35–41.
Article PubMed CAS Google Scholar
Yamada, S., Gotoh, O., and Yamana, H. (2006) Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost. BMC Bioinform. 7, 524.
Article Google Scholar
Gotoh, O. (1996) Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J. Mol. Biol. 264, 823–838.
Article PubMed CAS Google Scholar
Corpet, F. (1988) Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 16, 10881–10890.
Article PubMed CAS Google Scholar
Higgins, D. G. and Sharp, P. M. (1988) CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73, 237–244.
Article PubMed CAS Google Scholar
Higgins, D. G. and Sharp, P. M. (1989) Fast and sensitive multiple sequence alignments on a microcomputer. Comput. Appl. Biosci. 5, 151–153.
PubMed CAS Google Scholar
Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Article PubMed CAS Google Scholar
Katoh, K., Kuma, K., Toh, H., and Miyata, T. (2005) MAFFT version 5: improve- ment in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518.
Article PubMed CAS Google Scholar
Edgar, R. C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinform. 5, 113.
Article CAS Google Scholar
Notredame, C., Holm, L., and Higgins, D. G. (1998) COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14, 407–422.
Article PubMed CAS Google Scholar
Notredame, C., Higgins, D. G., and Heringa, J. (2000) T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217.
Article PubMed CAS Google Scholar
Lassmann, T. and Sonnhammer, E. L. (2005) Kalign–an accurate and fast multiple sequence alignment algorithm. BMC Bioinform. 6, 298.
Article CAS Google Scholar
Lee, C., Grasso, C., and Sharlow, M. F. (2002) Multiple sequence alignment using partial order graphs. Bioinformatics 18, 452–464.
Article PubMed CAS Google Scholar
Lee, C. (2003) Generating consensus sequences from partial order multiple sequence alignment graphs. Bioinformatics 19, 999–1008.
Article PubMed CAS Google Scholar
Grasso, C. and Lee, C. (2004) Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems. Bioinformatics 20, 1546–1556.
Article PubMed CAS Google Scholar
Do, C. B., Mahabhashyam, M. S., Brudno, M., and Batzoglou, S. (2005) ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res. 15, 330–340.
Article PubMed CAS Google Scholar
Pei, J. and Grishin, N. V. (2006) MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information. Nucleic Acids Res. 34, 4364–4374.
Article PubMed CAS Google Scholar
Pei, J. and Grishin, N. V. (2007) PROMALS: towards accurate multiple sequence alignments of distantly related proteins. Bioinformatics 23, 802–808.
Article PubMed CAS Google Scholar
Gribskov, M., McLachlan, A. D., and Eisenberg, D. (1987) Profile analysis: detection of distantly related proteins. Proc. Natl. Acad. Sci. US A 84, 4355–4358.
Google Scholar
von Ohsen, N., Sommer, I., and Zimmer, R. (2003) Profile-profile alignment: a powerful tool for protein structure prediction. Pac. Symp. Biocomput. 252–263.
Google Scholar
von Ohsen, N., Sommer, I., Zimmer, R., and Lengauer, T. (2004) Arby: automatic protein structure prediction using profile-profile alignment and confidence measures. Bioinformatics 20, 2228–2235.
Article Google Scholar
Soding, J. (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951–960.
Article PubMed Google Scholar
von Ohsen, N. and Zimmer, R. (2001) Improving profile-profile alignments via log-average scoring. WABI.
Google Scholar
Yona, G. and Levitt, M. (2002) Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. J. Mol. Biol. 315, 1257–1275.
Article PubMed CAS Google Scholar
Heger, A. and Holm, L. (2003) Exhaustive enumeration of protein domain families. J. Mol. Biol. 328, 749–767.
Article PubMed CAS Google Scholar
Mittelman, D., Sadreyev, R., and Grishin, N. (2003) Probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments. Bioinformatics 19, 1531–1539.
Article PubMed CAS Google Scholar
Sadreyev, R. and Grishin, N. (2003) COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J. Mol. Biol. 326, 317–336.
Article PubMed CAS Google Scholar
Edgar, R. C. and Sjolander, K. (2004) COACH: profile-profile alignment of protein families using hidden Markov models. Bioinformatics 20, 1309–1318.
Article PubMed CAS Google Scholar
Rychlewski, L., Jaroszewski, L., Li, W., and Godzik, A. (2000) Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci. 9, 232–241.
CAS Google Scholar
Edgar, R. C. and Sjolander, K. (2004) A comparison of scoring functions for protein sequence profile alignment. Bioinformatics 20, 1301–1308.
Article PubMed CAS Google Scholar
Ohlson, T., Wallner, B., and Elofsson, A. (2004) Profile-profile methods provide improved fold-recognition: a study of different profile–profile alignment methods. Proteins 57, 188–197.
Article PubMed CAS Google Scholar
Sokal, R. R. and Michener, C. D. (1958) A statistical method for evaluating systematic relationships. Univ. Kans. Sci. Bull. 28, 1409–1438.
Google Scholar
Sneath, P. H. and Sokal, R. R. (1962) Numerical taxonomy. Nature 193, 855–860.
Article PubMed CAS Google Scholar
Saitou, N. and Nei, M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
PubMed CAS Google Scholar
Studier, J. A. and Keppler, K. J. (1988) A note on the neighbor-joining algorithm of Saitou and Nei. Mol. Biol. Evol. 5, 729–731.
PubMed CAS Google Scholar
Jones, D. T., Taylor, W. R., and Thornton, J. M. (1992) The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282.
PubMed CAS Google Scholar
Edgar, R. C. (2004) Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res. 32, 380–385.
Article PubMed CAS Google Scholar
Wu, S. and Manber, U. (1992) Fast text searching allowing errors. Commun. ACM 35, 83–91.
Article Google Scholar
Vingron, M. and Argos, P. (1989) A fast and sensitive multiple sequence alignment algorithm. Comput. Appl. Biosci. 5, 115–121.
PubMed CAS Google Scholar
Vingron, M. and Argos, P. (1990) Determination of reliable regions in protein sequence alignments. Protein Eng. 3, 565–569.
Article PubMed CAS Google Scholar
Vingron, M. and Argos, P. (1991) Motif recognition and alignment for many sequences by comparison of dot-matrices. J. Mol. Biol. 218, 33–43.
Article PubMed CAS Google Scholar
Gotoh, O. (1990) Consistency of optimal sequence alignments. Bull. Math. Biol. 52, 509–525.
PubMed CAS Google Scholar
Van Walle, I., Lasters, I., and Wyns, L. (2003) Consistency matrices: quantified structure alignments for sets of related proteins. Proteins 51, 1–9.
Article PubMed CAS Google Scholar
Van Walle, I., Lasters, I., and Wyns, L. (2004) Align-m–a new algorithm for multiple alignment of highly divergent sequences. Bioinformatics 20, 1428–1435.
Article PubMed CAS Google Scholar
Do, C. B., Gross, S. S., and Batzoglou, S. (2006) CONTRAlign: discriminative training for protein sequence alignment. RECOMB.
Google Scholar
Lolkema, J. S. and Slotboom, D. J. (1998) Hydropathy profile alignment: a tool to search for structural homologues of membrane proteins. FEMS Microbiol. Rev. 22, 305–322.
Article PubMed CAS Google Scholar
Altschul, S. F., Carroll, R. J., and Lipman, D. J. (1989) Weights for data related by a tree. J. Mol. Biol. 207, 647–653.
Article PubMed CAS Google Scholar
Vingron, M. and Sibbald, P. R. (1993) Weighting in sequence space: a comparison of methods in terms of generalized sequences. Proc. Natl. Acad. Sci. USA 90, 8777–8781.
Google Scholar
Sibbald, P. R. and Argos, P. (1990) Weighting aligned protein or nucleic acid sequences to correct for unequal representation. J. Mol. Biol. 216, 813–818.
Article PubMed CAS Google Scholar
Henikoff, S. and Henikoff, J. G. (1994) Position-based sequence weights. J. Mol. Biol. 243, 574–578.
Article PubMed CAS Google Scholar
Eddy, S. R., Mitchison, G., and Durbin, R. (1995) Maximum discrimination hidden Markov models of sequence consensus. J. Comput. Biol. 2, 9–23.
Article PubMed CAS Google Scholar
Gotoh, O. (1995) A weighting system and algorithm for aligning many phylogenetically related sequences. Comput. Appl. Biosci. 11, 543–551.
PubMed CAS Google Scholar
Krogh, A. and Mitchison, G. (1995) Maximum entropy weighting of aligned sequences of proteins or DNA. Proc. Int. Conf. Intell. Syst. Mol. Biol. 3, 215–221.
Google Scholar
Karchin, R. and Hughey, R. (1998) Weighting hidden Markov models for maximum discrimination. Bioinformatics 14, 772–782.
Article PubMed CAS Google Scholar
May, A. C. (2001) Optimal classification of protein sequences and selection of representative sets from multiple alignments: application to homologous families and lessons for structural genomics. Protein Eng. 14, 209–217.
Article PubMed CAS Google Scholar
Hirosawa, M., Totoki, Y., Hoshida, M., and Ishikawa, M. (1995) Comprehensive study on iterative algorithms of multiple sequence alignment. Comput. Appl. Biosci. 11, 13–18.
PubMed CAS Google Scholar
Wang, Y. and Li, K. B. (2004) An adaptive and iterative algorithm for refining multiple sequence alignment. Comput. Biol. Chem. 28, 141–148.
Article PubMed CAS Google Scholar
Wallace, I. M., O’Sullivan, O., and Higgins, D. G. (2005) Evaluation of iterative alignment algorithms for multiple alignment. Bioinformatics 21, 1408–1414.
Article PubMed CAS Google Scholar
Brocchieri, L. and Karlin, S. (1998) A symmetric-iterated multiple alignment of protein sequences. J. Mol. Biol. 276, 249–264.
Article PubMed CAS Google Scholar
Subbiah, S. and Harrison, S. C. (1989) A method for multiple sequence alignment with gaps. J. Mol. Biol. 209, 539–548.
Article PubMed CAS Google Scholar
Barton, G. J. and Sternberg, M. J. (1987) A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons. J. Mol. Biol. 198, 327–337.
Article PubMed CAS Google Scholar
Barton, G. J. and Sternberg, M. J. (1987) Evaluation and improvements in the automatic alignment of protein sequences. Protein Eng. 1, 89–94.
Article PubMed CAS Google Scholar
Bains, W. (1986) MULTAN: a program to align multiple DNA sequences. Nucleic Acids Res. 14, 159–177.
Article PubMed CAS Google Scholar
Thompson, J. D., Thierry, J. C., and Poch, O. (2003) RASCAL: rapid scanning and correction of multiple sequence alignments. Bioinformatics 19, 1155–1161.
Article PubMed CAS Google Scholar
Chakrabarti, S., Lanczycki, C. J., Panchenko, A. R., Przytycka, T. M., Thiessen, P. A., and Bryant, S. H. (2006) State of the art: refinement of multiple sequence alignments. BMC Bioinform. 7, 499.
Google Scholar
Chakrabarti, S., Lanczycki, C. J., Panchenko, A. R., Przytycka, T. M., Thiessen, P. A., and Bryant, S. H. (2006) Refining multiple sequence alignments with conserved core regions. Nucleic Acids Res. 34, 2598–2606.
Article PubMed CAS Google Scholar
Huang, X. Q., Hardison, R. C., and Miller, W. (1990) A space-efficient algorithm for local similarities. Comput. Appl. Biosci. 6, 373–381.
PubMed CAS Google Scholar
Huang, X. and Miller, W. (1991) A time-efficient, linear-space local similarity algorithm. Adv. Appl. Math. 12, 337–357.
Article Google Scholar
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
PubMed CAS Google Scholar
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Article PubMed CAS Google Scholar
Pearson, W. R. (1998) Empirical statistical estimates for sequence similarity searches. J. Mol. Biol. 276, 71–84.
Article PubMed CAS Google Scholar
Pearson, W. R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183, 63–98.
Article PubMed CAS Google Scholar
Pearson, W. R. (2000) Flexible sequence similarity searching with the FASTA3 program package. Methods Mol. Biol. 132, 185–219.
PubMed CAS Google Scholar
Morgenstern, B., Dress, A., and Werner, T. (1996) Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc. Natl. Acad. Sci. USA 93, 12098–12103.
Google Scholar
Morgenstern, B., Frech, K., Dress, A., and Werner, T. (1998) DIALIGN: finding local similarities by multiple sequence alignment. Bioinformatics 14, 290–294.
Article PubMed CAS Google Scholar
Morgenstern, B. (1999) DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15, 211–218.
Article PubMed CAS Google Scholar
Morgenstern, B. (2004) DIALIGN: multiple DNA and protein sequence alignment at BiBiServ. Nucleic Acids Res. 32, W33–36.
Article PubMed CAS Google Scholar
Subramanian, A. R., Weyer-Menkhoff, J., Kaufmann, M., and Morgenstern, B. (2005) DIALIGN-T: an improved algorithm for segment-based multiple sequence alignment. BMC Bioinform. 6, 66.
Article CAS Google Scholar
Depiereux, E. and Feytmans, E. (1992) MATCH-BOX: a fundamentally new algorithm for the simultaneous alignment of several protein sequences. Comput. Appl. Biosci. 8, 501–509.
PubMed CAS Google Scholar
Depiereux, E., Baudoux, G., Briffeuil, P., Reginster, I., De Bolle, X., Vinals, C., et al. (1997) Match-Box_server: a multiple sequence alignment tool placing emphasis on reliability. Comput. Appl. Biosci. 13, 249–256.
PubMed CAS Google Scholar
Schwartz, A. S. and Pachter, L. (2007) Multiple alignment by sequence annealing. Bioinformatics 23, e24–29.
Article PubMed CAS Google Scholar
Pellegrini, M., Marcotte, E. M., and Yeates, T. O. (1999) A fast algorithm for genome-wide analysis of proteins with repeated sequences. Proteins 35, 440–446.
Article PubMed CAS Google Scholar
Notredame, C. (2001) Mocca: semi-automatic method for domain hunting. Bioinformatics 17, 373–374.
Article PubMed CAS Google Scholar
Heger, A. and Holm, L. (2000) Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41, 224–237.
Article PubMed CAS Google Scholar
Heringa, J. and Argos, P. (1993) A method to recognize distant repeats in protein sequences. Proteins 17, 391–341.
Article PubMed CAS Google Scholar
Szklarczyk, R. and Heringa, J. (2004) Tracking repeats using significance and transitivity. Bioinformatics 20(Suppl 1), I311–I317.
Article PubMed CAS Google Scholar
Sammeth, M. and Heringa, J. (2006) Global multiple-sequence alignment with repeats. Proteins 64, 263–274.
Article PubMed CAS Google Scholar
Lawrence, C. E., Altschul, S. F., Boguski, M. S., Liu, J. S., Neuwald, A. F., and Wootton, J. C. (1993) Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262, 208–214.
Article PubMed CAS Google Scholar
Neuwald, A. F., Liu, J. S., and Lawrence, C. E. (1995) Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Sci. 4, 1618–1632.
Article PubMed CAS Google Scholar
Henikoff, S., Henikoff, J. G., Alford, W. J., and Pietrokovski, S. (1995) Automated construction and graphical presentation of protein blocks from unaligned sequences. Gene 163, GC17–26.
Article PubMed CAS Google Scholar
Smith, H. O., Annau, T. M., and Chandrasegaran, S. (1990) Finding sequence motifs in groups of functionally related proteins. Proc. Natl. Acad. Sci. USA 87, 826–830.
Google Scholar
Bailey, T. L. and Elkan, C. (1994) Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2, 28–36.
Google Scholar
Sonnhammer, E. L. and Kahn, D. (1994) Modular arrangement of proteins as inferred from analysis of homology. Protein Sci. 3, 482–492.
Article PubMed CAS Google Scholar
Schuler, G. D., Altschul, S. F., and Lipman, D. J. (1991) A workbench for multiple alignment construction and analysis. Proteins 9, 180–190.
Article PubMed CAS Google Scholar
Pevzner, P. A., Tang, H., and Tesler, G. (2004) De novo repeat classification and fragment assembly. Genome Res. 14, 1786–1796.
Article PubMed CAS Google Scholar
Raphael, B., Zhi, D., Tang, H., and Pevzner, P. (2004) A novel method for multiple alignment of sequences with repeated and shuffled elements. Genome Res. 14, 2336–2346.
Article PubMed CAS Google Scholar
Phuong, T. M., Do, C. B., Edgar, R. C., and Batzoglou, S. (2006) Multiple alignment of protein sequences with repeats and rearrangements. Nucleic Acids Res. 34, 5932–5942.
Article PubMed CAS Google Scholar
Bishop, M. J. and Thompson, E. A. (1986) Maximum likelihood alignment of DNA sequences. J. Mol. Biol. 190, 159–165.
Article PubMed CAS Google Scholar
Hein, J., Wiuf, C., Knudsen, B., Moller, M. B., and Wibling, G. (2000) Statistical alignment: computational properties, homology testing and goodness-of-fit. J. Mol. Biol. 302, 265–279.
Article PubMed CAS Google Scholar
Thorne, J. L., Kishino, H., and Felsenstein, J. (1991) An evolutionary model for maximum likelihood alignment of DNA sequences. J. Mol. Evol. 33, 114–124.
Article PubMed CAS Google Scholar
Thorne, J. L., Kishino, H., and Felsenstein, J. (1992) Inching toward reality: an improved likelihood model of sequence evolution. J. Mol. Evol. 34, 3–16.
Article PubMed CAS Google Scholar
Miklos, I. and Toroczkai, Z. (2001) An improved model for statistical alignment. WABI.
Google Scholar
Miklos, I. (2003) Algorithm for statistical alignment of sequences derived from a Poisson sequence length distribution. Disc. Appl. Math. 127, 79–84.
Article Google Scholar
Miklos, I., Lunter, G. A., and Holmes, I. (2004) A “Long Indel” model for evolutionary sequence alignment. Mol. Biol. Evol. 21, 529–540.
Article PubMed CAS Google Scholar
Knudsen, B. and Miyamoto, M. M. (2003) Sequence alignments and pair hidden Markov models using evolutionary history. J. Mol. Biol. 333, 453–460.
Article PubMed CAS Google Scholar
Metzler, D. (2003) Statistical alignment based on fragment insertion and deletion models. Bioinformatics 19, 490–499.
Article PubMed CAS Google Scholar
Hein, J. (2001) A generalisation of the Thorne-Kishino-Felsenstein model of statistical alignment to k sequences related by a binary tree. PSB.
Google Scholar
Hein, J., Jensen, J. L., and Pedersen, C. N. (2003) Recursions for statistical multiple alignment. Proc. Natl. Acad. Sci. USA 100, 14960–14965.
Google Scholar
Holmes, I. and Bruno, W. J. (2001) Evolutionary HMMs: a Bayesian approach to multiple alignment. Bioinformatics 17, 803–820.
Article PubMed CAS Google Scholar
Holmes, I. (2003) Using guide trees to construct multiple-sequence evolutionary HMMs. Bioinformatics 19(Suppl 1), i147–157.
Article PubMed Google Scholar
Steel, M. and Hein, J. (2001) Applying the Thorne-Kishino-Felsenstein model to sequence evolution on a star-shaped tree. Appl. Math. Lett. 14, 679–684.
Article Google Scholar
Miklos, I. (2002) An improved algorithm for statistical alignment of sequences related by a star tree. Bull. Math. Biol. 64, 771–779.
Article PubMed CAS Google Scholar
Lunter, G. A., Miklos, I., Song, Y. S., and Hein, J. (2003) An efficient algorithm for statistical multiple alignment on arbitrary phylogenetic trees. J. Comput. Biol. 10, 869–889.
Article PubMed CAS Google Scholar
Jensen, J. L. and Hein, J. (2005) Gibbs sampler for statistical multiple alignment. Stat. Sin. 15, 889–907.
Google Scholar
Hein, J. (1990) Unified approach to alignment and phylogenies. Methods Enzymol. 183, 626–645.
Article PubMed CAS Google Scholar
Vingron, M. and von Haeseler, A. (1997) Towards integration of multiple alignment and phylogenetic tree construction. J. Comput. Biol. 4, 23–34.
Article PubMed CAS Google Scholar
Fleissner, R., Metzler, D., and von Haeseler, A. (2005) Simultaneous statistical multiple alignment and phylogeny reconstruction. Syst. Biol. 54, 548–561.
Article PubMed Google Scholar
Lunter, G., Miklos, I., Drummond, A., Jensen, J. L., and Hein, J. (2005) Bayesian coestimation of phylogeny and sequence alignment. BMC Bioinform. 6, 83.
Article CAS Google Scholar
Redelings, B. D. and Suchard, M. A. (2005) Joint Bayesian estimation of alignment and phylogeny. Syst. Biol. 54, 401–418.
Article PubMed Google Scholar
Metzler, D., Fleissner, R., Wakolbinger, A., and von Haeseler, A. (2001) Assessing variability by joint sampling of alignments and mutation rates. J. Mol. Evol. 53, 660–669.
Article PubMed CAS Google Scholar
Allison, L. and Wallace, C. S. (1994) The posterior probability distribution of alignments and its application to parameter estimation of evolutionary trees and to optimization of multiple alignments. J. Mol. Evol. 39, 418–430.
Article PubMed CAS Google Scholar
Krogh, A., Brown, M., Mian, I. S., Sjolander, K., and Haussler, D. (1994) Hidden Markov models in computational biology. Applications to protein modeling. J. Mol. Biol. 235, 1501–1531.
Article PubMed CAS Google Scholar
Krogh, A. (1998) An introduction to hidden Markov models for biological sequences. In Computational Methods in Molecular Biology (Salzberg, S., Searls, D., Kasif, S., eds.). Elsevier Science, St. Louis, MO, pp. 45–63.
Chapter Google Scholar
Hughey, R. and Krogh, A. (1996) Hidden Markov models for sequence analysis: extension and analysis of the basic method. Comput. Appl. Biosci. 12, 95–107.
Eddy, S. R. (1996) Hidden Markov models. Curr. Opin. Struct. Biol. 6, 361–365.
Eddy, S. R. (1998) Profile hidden Markov models. Bioinformatics 14, 755–763.
Mamitsuka, H. (2005) Finding the biologically optimal alignment of multiple sequences. Artif. Intell. Med. 35, 9–18.
Baldi, P. and Chauvin, Y. (1994) Smooth on-line learning algorithms for hidden Markov models. Neural Comput. 6, 307–318.
Baldi, P., Chauvin, Y., Hunkapiller, T., and McClure, M. A. (1994) Hidden Markov models of biological primary sequence information. Proc. Natl. Acad. Sci. USA 91, 1059–1063.
Viterbi, A. J. (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory IT-13, 260–269.
Grundy, W. N., Bailey, T. L., Elkan, C. P., and Baker, M. E. (1997) Meta-MEME: motif-based hidden Markov models of protein families. Comput. Appl. Biosci. 13, 397–406.
Bucher, P., Karplus, K., Moeri, N., and Hofmann, K. (1996) A flexible motif search technique based on generalized profiles. Comput. Chem. 20, 3–23.
Karplus, K., Barrett, C., and Hughey, R. (1998) Hidden Markov models for detecting remote protein homologies. Bioinformatics 14, 846–856.
Park, J., Karplus, K., Barrett, C., Hughey, R., Haussler, D., Hubbard, T., et al. (1998) Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J. Mol. Biol. 284, 1201–1210.
Sonnhammer, E. L., Eddy, S. R., Birney, E., Bateman, A., and Durbin, R. (1998) Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res. 26, 320–322.
Eddy, S. R. HMMER: a profile hidden Markov modeling package, available from http://hmmer.janelia.org/.
Sjolander, K., Karplus, K., Brown, M., Hughey, R., Krogh, A., Mian, I. S., et al. (1996) Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. Comput. Appl. Biosci. 12, 327–345.
Barrett, C., Hughey, R., and Karplus, K. (1997) Scoring hidden Markov models. Comput. Appl. Biosci. 13, 191–199.
McClure, M. A., Smith, C., and Elton, P. (1996) Parameterization studies for the SAM and HMMER methods of hidden Markov model generation. Proc. Int. Conf. Intell. Syst. Mol. Biol. 4, 155–164.
Karplus, K. and Hu, B. (2001) Evaluation of protein multiple alignments by SAM-T99 using the BAliBASE multiple alignment test set. Bioinformatics 17, 713–720.
Loytynoja, A. and Milinkovitch, M. C. (2003) A hidden Markov model for progressive multiple alignment. Bioinformatics 19, 1505–1513.
Edgar, R. C. and Sjolander, K. (2003) Simultaneous sequence alignment and tree construction using hidden Markov models. Pac. Symp. Biocomput. 180–191.
Edgar, R. C. and Sjolander, K. (2003) SATCHMO: sequence alignment and tree construction using hidden Markov models. Bioinformatics 19, 1404–1411.
Loytynoja, A. and Goldman, N. (2005) An algorithm for progressive multiple alignment of sequences with insertions. Proc. Natl. Acad. Sci. USA 102, 10557–10562.
Holmes, I. and Durbin, R. (1998) Dynamic programming alignment accuracy. J. Comput. Biol. 5, 493–504.
Schwartz, A. S., Myers, E., and Pachter, L. (2006) Alignment metric accuracy. arXiv 2006:q-bio.QM/0510052.
Roshan, U. and Livesay, D. R. (2006) Probalign: multiple sequence alignment using partition function posterior probabilities. Bioinformatics 22, 2715–2721.
Wallace, I. M., O’Sullivan, O., Higgins, D. G., and Notredame, C. (2006) M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 34, 1692–1699.
Kececioglu, J. D. (1993) The maximum weight trace problem in multiple sequence alignment. CPM.
Kececioglu, J. D., Lenhof, H.-P., Mehlhorn, K., Mutzel, P., Reinert, K., and Vingron, M. (2000) A polyhedral approach to sequence alignment problems. Disc. Appl. Math. 104, 143–186.
Koller, G. and Raidl, G. R. (2004) An evolutionary algorithm for the maximum weight trace formulation of the multiple sequence alignment problem. In LNCS, 3242, pp. 302–311.
Simossis, V. A. and Heringa, J. (2005) PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Res. 33, W289–294.
Simossis, V. A., Kleinjung, J., and Heringa, J. (2005) Homology-extended sequence alignment. Nucleic Acids Res. 33, 816–824.
Thompson, J. D., Plewniak, F., Thierry, J., and Poch, O. (2000) DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches. Nucleic Acids Res. 28, 2919–2926.
Wang, J. and Feng, J. A. (2005) NdPASA: a novel pairwise protein sequence alignment algorithm that incorporates neighbor-dependent amino acid propensities. Proteins 58, 628–637.
Yang, A. S. (2002) Structure-dependent sequence alignment for remotely related proteins. Bioinformatics 18, 1658–1665.
Zhou, H. and Zhou, Y. (2005) SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures. Bioinformatics 21, 3615–3621.
O’Sullivan, O., Suhre, K., Abergel, C., Higgins, D. G., and Notredame, C. (2004) 3DCoffee: combining protein sequences and structures within multiple sequence alignments. J. Mol. Biol. 340, 385–395.
Armougom, F., Moretti, S., Poirot, O., Audic, S., Dumas, P., Schaeli, B., et al. (2006) Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee. Nucleic Acids Res. 34, W604–608.
Thompson, J. D., Plewniak, F., and Poch, O. (1999) BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15, 87–88.
Thompson, J. D., Plewniak, F., and Poch, O. (1999) A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27, 2682–2690.
Mizuguchi, K., Deane, C. M., Blundell, T. L., and Overington, J. P. (1998) HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci. 7, 2469–2471.
Van Walle, I., Lasters, I., and Wyns, L. (2005) SABmark–a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 21, 1267–1268.
Raghava, G. P., Searle, S. M., Audley, P. C., Barber, J. D., and Barton, G. J. (2003) OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy. BMC Bioinform. 4, 47.
Thompson, J. D., Koehl, P., Ripp, R., and Poch, O. (2005) BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins 61, 127–136.
Sauder, J. M., Arthur, J. W., and Dunbrack, R. L., Jr. (2000) Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins 40, 6–22.
Pang, A., Smith, A. D., Nuin, P. A., and Tillier, E. R. (2005) SIMPROT: using an empirically determined indel distribution in simulations of protein evolution. BMC Bioinform. 6, 236.
Nuin, P. A., Wang, Z., and Tillier, E. R. (2006) The accuracy of several multiple sequence alignment programs for proteins. BMC Bioinform. 7, 471.
Stoye, J., Evers, D., and Meyer, F. (1998) Rose: generating sequence families. Bioinformatics 14, 157–163.
Eidhammer, I., Jonassen, I., and Taylor, W. R. (2000) Structure comparison and structure patterns. J. Comput. Biol. 7, 685–716.
Carugo, O. and Pongor, S. (2001) A normalized root-mean-square distance for comparing protein three-dimensional structures. Protein Sci. 10, 1470–1473.
Armougom, F., Moretti, S., Keduas, V., and Notredame, C. (2006) The iRMSD: a local measure of sequence alignment accuracy using structural information. Bioinformatics 22, e35–39.
Chew, L. P., Huttenlocher, D., Kedem, K., and Kleinberg, J. (1999) Fast detection of common geometric substructure in proteins. J. Comput. Biol. 6, 313–325.
O’Sullivan, O., Zehnder, M., Higgins, D., Bucher, P., Grosdidier, A., and Notredame, C. (2003) APDB: a novel measure for benchmarking sequence alignment methods without reference alignments. Bioinformatics 19(Suppl 1), i215–221.
Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10915–10919.
Dayhoff, M. O., Eck, R. V., and Park, C. M. (1972) A model of evolutionary change in proteins. In Atlas of Protein Sequence and Structure (Dayhoff, M. O., ed.). National Biomedical Research Foundation, Washington, DC, pp. 89–99.
Dayhoff, M. O., Schwartz, R. M., and Orcutt, B. C. (1978) A model of evolutionary change in proteins. In Atlas of Protein Sequence and Structure (Dayhoff, M. O., ed.). National Biomedical Research Foundation, Washington, DC, pp. 345–352.
Muller, T. and Vingron, M. (2000) Modeling amino acid replacement. J. Comput. Biol. 7, 761–776.
Whelan, S. and Goldman, N. (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18, 691–699.
Prlic, A., Domingues, F. S., and Sippl, M. J. (2000) Structure-derived substitution matrices for alignment of distantly related sequences. Protein Eng. 13, 545–550.
Reese, J. T. and Pearson, W. R. (2002) Empirical determination of effective gap penalties for sequence comparison. Bioinformatics 18, 1500–1507.
Arribas-Gil, A., Gassiat, E., and Matias, C. (2006) Parameter estimation in pair-hidden Markov models. Scand. J. Stat. 33, 651–671.
Liu, J. S., Neuwald, A. F., and Lawrence, C. E. (1995) Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. J. Am. Stat. Assoc. 90, 1156–1170.
Zhu, J., Liu, J. S., and Lawrence, C. E. (1998) Bayesian adaptive sequence alignment algorithms. Bioinformatics 14, 25–39.
Kececioglu, J. and Kim, E. (2007) Simple and fast inverse alignment. RECOMB.
Yu, C.-N., Joachims, T., Elber, R., and Pillardy, J. (2007) Support vector training of protein alignment models. RECOMB.
Tsochantaridis, I., Joachims, T., Hofmann, T., and Altun, Y. (2005) Large margin methods for structured and interdependent output variables. J. Mach. Learn. Res. 6, 1453–1484.
Katoh, K. and Toh, H. (2007) PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences. Bioinformatics 23, 372–374.
Ahola, V., Aittokallio, T., Vihinen, M., and Uusipaikka, E. (2006) A statistical score for assessing the quality of multiple sequence alignments. BMC Bioinform. 7, 484.
Altschul, S. F. (1998) Generalized affine gap costs for protein sequence alignment. Proteins 32, 88–96.
Zachariah, M. A., Crooks, G. E., Holbrook, S. R., and Brenner, S. E. (2005) A generalized affine gap model significantly improves protein sequence alignment accuracy. Proteins 58, 329–338.
Thompson, J. D., Muller, A., Waterhouse, A., Procter, J., Barton, G. J., Plewniak, F., et al. (2006) MACSIMS: multiple alignment of complete sequences information management system. BMC Bioinform. 7, 318.
Thompson, J. D., Holbrook, S. R., Katoh, K., Koehl, P., Moras, D., Westhof, E., et al. (2005) MAO: a multiple alignment ontology for nucleic acid and protein sequences. Nucleic Acids Res. 33, 4164–4171.
Gotoh, O. (1999) Multiple sequence alignment: algorithms and applications. Adv. Biophys. 36, 159–206.
Phillips, A., Janies, D., and Wheeler, W. (2000) Multiple sequence alignment in phylogenetic analysis. Mol. Phylogenet. Evol. 16, 317–330.
Lambert, C., Campenhout, J. M. V., DeBolle, X., and Depiereux, E. (2003) Review of common sequence alignment methods: clues to enhance reliability. Curr. Genom. 4, 131–146.
Wallace, I. M., Blackshields, G., and Higgins, D. G. (2005) Multiple sequence alignments. Curr. Opin. Struct. Biol. 15, 261–266.
Edgar, R. C. and Batzoglou, S. (2006) Multiple sequence alignment. Curr. Opin. Struct. Biol. 16, 368–373.
Morrison, D. A. (2006) Multiple sequence alignment for phylogenetic purposes. Aust. Syst. Bot. 19, 479–539.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2001) Introduction to Algorithms. MIT Press, Cambridge, MA.
Eppstein, D. (2000) Fast hierarchical clustering and other applications of dynamic closest pairs. J. Exp. Algorithmics 5, 1–23.
Elias, I. and Lagergren, J. (2005) Fast neighbor joining. ICALP.
Waterman, M. S., Eggert, M., and Lander, E. (1992) Parametric sequence comparisons. Proc. Natl. Acad. Sci. USA 89, 6090–6093.
Waterman, M. S. (1994) Parametric and ensemble sequence alignment algorithms. Bull. Math. Biol. 56, 743–767.
Gusfield, D., Balasubramanian, K., and Naor, D. (1994) Parametric optimization of sequence alignment. Algorithmica 12, 312–326.

Acknowledgments

We thank Karen Ann Lee for help in preparing the manuscript. C.B.D. was funded by an NDSEG fellowship.

Author information

Authors and Affiliations

Computer Science Department, Stanford University, Stanford, CA, USA
Chuong B. Do & Kazutaka Katoh

Authors

Chuong B. Do
Kazutaka Katoh

Editor information

Editors and Affiliations

Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France
Julie D. Thompson
LSMBO ECPM, Institut Pluridisciplinaire Hubert Curien, Strasbourg, France
Christine Schaeffer-Reiss
Department of Protein Science Helmholtz Zentrum München, German Research Center for Environmental Health, Munich-Neuherberg, Germany
Marius Ueffing

About this protocol

Cite this protocol

Do, C.B., Katoh, K. (2009). Protein Multiple Sequence Alignment. In: Thompson, J.D., Schaeffer-Reiss, C., Ueffing, M. (eds) Functional Proteomics. Methods In Molecular Biology™, vol 484. Humana Press. https://doi.org/10.1007/978-1-59745-398-1_25

DOI: https://doi.org/10.1007/978-1-59745-398-1_25
Publisher Name: Humana Press
Print ISBN: 978-1-58829-971-0
Online ISBN: 978-1-59745-398-1
eBook Packages: Springer Protocols