Algorithms for Multiple Protein Structure Alignment and Structure-Derived Multiple Sequence Alignment

  • Maxim Shatsky
  • Ruth Nussinov
  • Haim J. Wolfson
Part of the Methods in Molecular Biology™ book series (MIMB, volume 413)


The primary amino acid sequence and the geometry of the folded 3D structure are major determinants of protein function. During evolution, a protein's 3D structure is conserved better than its primary sequence, so the analysis of protein structures is expected to yield deeper insight into protein function. Recognition of a structural core common to a set of protein structures serves as a basic tool for studying protein evolution and classification, for analyzing similar structural motifs and functional binding sites, and for homology modeling and threading.

In this chapter, we discuss several biologically related computational aspects of multiple structure alignment and propose a method that addresses them. Finally, we address the problem of structure-based multiple sequence alignment and propose an optimization method that unifies primary sequence and 3D structure information.


Keywords: multiple structure alignment; partial alignment; structure-based sequence alignment; structure-sequence conservation



The research of M. Shatsky is supported by a PhD fellowship in “Complexity Science” from the Yeshaya Horowitz association. This research was supported by the Israel Science Foundation (grant no. 281/05), the Binational US-Israel Science Foundation (BSF) and by the Hermann Minkowski-Minerva Center for Geometry at Tel Aviv University. The research of R. Nussinov has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under contract number NO1-CO-12400. This research was supported (in part) by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.



Copyright information

© Humana Press Inc 2008
