Using Secondary Structure Information to Perform Multiple Alignment

  • Giuliano Armano
  • Luciano Milanesi
  • Alessandro Orro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3737)

Abstract

This paper describes an approach to multiple alignment that exploits any available secondary structure information. Given the sequences to be aligned, their secondary structure (either available or predicted) is used to perform an initial alignment, which is then refined by locally scoped operators that "rearrange" the primary level. To evaluate both the performance of the technique and the impact of "true" secondary structure information on alignment quality, a suitable algorithm has been implemented and assessed on relevant test cases. Experimental results indicate that the proposed solution is particularly effective for aligning low-similarity protein sequences.
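The two-level idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it handles only two sequences, uses a plain Needleman-Wunsch global alignment on secondary structure strings, and omits the refinement stage entirely. The function names (`align_secondary`, `project`), the H/E/C alphabet, the scoring values, and the example sequences are all assumptions made for illustration.

```python
def align_secondary(ss_a, ss_b, match=2, mismatch=-1, gap=-2):
    """Globally align two secondary-structure strings (Needleman-Wunsch).

    Returns a list of columns (i, j); None marks a gap on that side.
    Scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(ss_a), len(ss_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell, preferring diagonal moves.
    cols = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                cols.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            cols.append((i - 1, None))
            i -= 1
        else:
            cols.append((None, j - 1))
            j -= 1
    cols.reverse()
    return cols


def project(seq, cols, side):
    """Carry the secondary-level columns over to a primary sequence."""
    return "".join("-" if c[side] is None else seq[c[side]] for c in cols)


if __name__ == "__main__":
    # Hypothetical residues and H/E/C annotations of equal length per sequence.
    seq_a, ss_a = "MKTAYIAK", "CHHHHHCC"
    seq_b, ss_b = "MKAYIK", "CHHHCC"
    cols = align_secondary(ss_a, ss_b)
    print(project(seq_a, cols, 0))
    print(project(seq_b, cols, 1))
```

The gaps are decided purely at the secondary level and then copied onto the residues, so the result is only a starting point; a refinement step acting on the primary level, as the abstract describes, would be needed to adjust residue placement.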

Keywords

Secondary Structure; Multiple Alignment; Secondary Level; Pairwise Alignment; Primary Level

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Giuliano Armano (1)
  • Luciano Milanesi (2)
  • Alessandro Orro (1)
  1. University of Cagliari, Cagliari, Italy
  2. ITB-CNR, Segrate Milano, Italy
