Using Secondary Structure Information to Perform Multiple Alignment

  • Conference paper
Transactions on Computational Systems Biology III

Abstract

This paper describes an approach to multiple alignment that exploits any available secondary structure information. Given the sequences to be aligned, their secondary structure (either known or predicted) is used to build an initial alignment, which is then refined by locally scoped operators that “rearrange” the primary level. To evaluate both the performance of the technique and the impact of “true” secondary structure information on alignment quality, a suitable algorithm has been implemented and assessed on relevant test cases. Experimental results indicate that the proposed solution is particularly effective for aligning low-similarity protein sequences.
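
The two-stage idea in the abstract (align at the secondary structure level first, then carry that alignment down to the residues) can be illustrated with a minimal sketch. This is not the authors' algorithm, which is not shown in this preview: it assumes a pairwise case, a three-state structure alphabet (H, E, C), a plain Needleman-Wunsch global alignment with illustrative match, mismatch, and gap scores, and it omits the local refinement operators entirely. The function names `align_by_structure` and `project_gaps` are hypothetical.

```python
def align_by_structure(ss_a, ss_b, match=2, mismatch=-1, gap=-2):
    """Globally align two secondary-structure strings (Needleman-Wunsch).

    Scores are illustrative assumptions, not values from the paper.
    Returns two gapped strings of equal length, with '-' marking gaps.
    """
    n, m = len(ss_a), len(ss_b)
    # score[i][j] = best score for aligning ss_a[:i] with ss_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(ss_a[i - 1])
                out_b.append(ss_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(ss_a[i - 1])
            out_b.append('-')
            i -= 1
        else:
            out_a.append('-')
            out_b.append(ss_b[j - 1])
            j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b))


def project_gaps(gapped_ss, primary):
    """Copy the gap pattern of a gapped structure string onto its residues."""
    residues = iter(primary)
    return ''.join('-' if c == '-' else next(residues) for c in gapped_ss)


# Toy example: the second sequence lacks the two-residue coil between
# the helix and the strand, so the gap lands opposite "CC".
gapped_a, gapped_b = align_by_structure("HHHHCCEEEE", "HHHHEEEE")
print(gapped_a)                              # HHHHCCEEEE
print(gapped_b)                              # HHHH--EEEE
print(project_gaps(gapped_b, "MKTAVLIV"))    # MKTA--VLIV
```

In a full method along the lines the abstract describes, the projected residue alignment would serve only as a starting point for the refinement stage; the sketch stops before that step.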

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Armano, G., Milanesi, L., Orro, A. (2005). Using Secondary Structure Information to Perform Multiple Alignment. In: Priami, C., Merelli, E., Gonzalez, P., Omicini, A. (eds) Transactions on Computational Systems Biology III. Lecture Notes in Computer Science, vol 3737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599128_6

  • DOI: https://doi.org/10.1007/11599128_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30883-6

  • Online ISBN: 978-3-540-31446-2

  • eBook Packages: Computer Science, Computer Science (R0)
