Artificial Intelligence Review, Volume 42, Issue 3, pp 445–459

Going over the three dimensional protein structure similarity problem

  • Nantia Iakovidou
  • Eleftherios Tiakas
  • Konstantinos Tsichlas
  • Yannis Manolopoulos


This article presents in detail our proposed methodology for detecting similarity between two or more three-dimensional protein structures. The novelty of our algorithm is that, during the similarity computation, it combines many attributes and satisfies a number of preconditions, which are discussed extensively throughout the paper. Our approach is also supported by an efficient and effective indexing scheme, which provides convincing results compared with other known methods.


Keywords: Protein structure similarity; Indexing scheme; Combined linear measure



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Nantia Iakovidou (1)
  • Eleftherios Tiakas (1)
  • Konstantinos Tsichlas (1)
  • Yannis Manolopoulos (1)
  1. Department of Informatics, Aristotle University of Thessaloniki, Thessaloníki, Greece
