Using Secondary Structure Information to Perform Multiple Alignment

Armano, Giuliano; Milanesi, Luciano; Orro, Alessandro

doi:10.1007/11599128_6

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 3737)

Abstract

This paper describes an approach to multiple alignment that can exploit any available secondary structure information. Given the sequences to be aligned, their secondary structure (either available or predicted) is used to build an initial alignment, which is then refined by locally scoped operators that “rearrange” the primary level. To evaluate both the performance of the technique and the impact of “true” secondary structure information on alignment quality, a suitable algorithm has been implemented and assessed on relevant test cases. Experimental results indicate that the proposed solution is particularly effective when aligning low-similarity protein sequences.
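
To make the first stage concrete, the following is a minimal sketch of aligning two proteins on their secondary-structure strings and projecting the resulting gaps onto the primary sequences. It is not the authors' algorithm: it handles only a pair of sequences, uses plain Needleman-Wunsch global alignment with a linear gap penalty, and omits the local refinement operators entirely. The function name, the three-letter structure alphabet (H, E, C), the scoring values, and the toy sequences are all assumptions made for illustration.

```python
def align_by_secondary_structure(seq_a, ss_a, seq_b, ss_b,
                                 match=2, mismatch=-1, gap=-2):
    """Globally align two proteins on their secondary-structure strings
    (Needleman-Wunsch, linear gap penalty) and return the primary
    sequences with the resulting gaps inserted, plus the score.
    All scoring values are illustrative assumptions."""
    if len(seq_a) != len(ss_a) or len(seq_b) != len(ss_b):
        raise ValueError("each sequence needs one structure symbol per residue")
    n, m = len(ss_a), len(ss_b)

    # score[i][j] = best score for aligning ss_a[:i] with ss_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # pair the two positions
                              score[i - 1][j] + gap,     # gap in sequence b
                              score[i][j - 1] + gap)     # gap in sequence a

    # Trace back from the bottom-right corner, emitting primary residues.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Toy input (hypothetical): residues paired with H = helix, C = coil.
aligned_a, aligned_b, total = align_by_secondary_structure(
    "ACDEFG", "HHHCCC", "ACDFG", "HHHCC")
print(aligned_a)
print(aligned_b)
print(total)
```

In this sketch the residues themselves never influence the scoring, so the result is only a structure-level starting point; a refinement step working on the primary level, as the abstract describes, would be needed afterwards.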

Author information

Authors and Affiliations

University of Cagliari, Piazza d’Armi, I-09123, Cagliari, Italy
Giuliano Armano & Alessandro Orro
ITB-CNR, Via Fratelli Cervi, 93, 20090, Segrate Milano, Italy
Luciano Milanesi

Editor information

Editors and Affiliations

Centre for Computational and Systems Biology, The Microsoft Research - University of Trento, Piazza Manci, 17, 38050, Povo (TN), Italy
Corrado Priami
Dipartimento di Matematica e Informatica, Università di Camerino, I-62032, Camerino, Italy
Emanuela Merelli
DEIS, Università di Bologna, Via Venezia 52, 47023, Cesena, Italy
Pablo Gonzalez
Alma Mater Studiorum Università di Bologna a Cesena, via Venezia 52, 47023, Cesena, Italy
Andrea Omicini

About this paper

Cite this paper

Armano, G., Milanesi, L., Orro, A. (2005). Using Secondary Structure Information to Perform Multiple Alignment. In: Priami, C., Merelli, E., Gonzalez, P., Omicini, A. (eds) Transactions on Computational Systems Biology III. Lecture Notes in Computer Science, vol 3737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599128_6

DOI: https://doi.org/10.1007/11599128_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30883-6
Online ISBN: 978-3-540-31446-2
eBook Packages: Computer Science (R0)
