Theoretical and Computational Aspects of Protein Structural Alignment

  • Paweł Daniluk
  • Bogdan Lesyng
Part of the Springer Series in Bio-/Neuroinformatics book series (SSBN, volume 1)


Computing alignments of proteins based on their structure is one of the fundamental tasks of bioinformatics. It is crucial in all kinds of comparative analysis as well as in evolutionary and functional classification. Whereas the determination of sequence relationships is well founded in statistical models, there is still considerable uncertainty over how to describe geometric relationships between proteins. The continuous growth of structural databases calls for fast and reliable algorithmic methods that can effectively compute alignments of pairs and larger sets of protein molecules. Although such methodologies have been developed over the past two decades, there remain so-called “difficult similarities”, which may include repeats, insertions or deletions, permutations, and conformational changes. A brief overview of existing methodologies, with emphasis on different approaches to the decomposition of structures into smaller fragments, is followed by a presentation of a formalism of local descriptors of protein structures. A formal definition of the problem of computing optimal alignments that accommodate the aforementioned difficulties is presented, along with an analysis of the computational complexity of its important variants. Examples of “difficult similarities” and practical aspects of protein structure comparison are discussed.


Multiple Alignment, Local Similarity, Maximal Clique, Structural Alignment, Local Descriptor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Paweł Daniluk (1, 2)
  • Bogdan Lesyng (1, 2)
  1. Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland
  2. Bioinformatics Laboratory, Mossakowski Medical Research Centre, Warsaw, Poland
