Analogy-based protein structure prediction: I. A new database of spatially similar and dissimilar structures of protein domains for testing and optimizing prediction methods

  • Mathematical and Systemic Biology
Molecular Biology

Abstract

The creation and analysis of the 3Dfold_test database are described. This database comprises a large set of pairs of spatially similar protein domain structures and a larger control set of “decoys,” spatially dissimilar protein structures with approximately the same size and compactness as each member of each pair. The database is available at http://phys.protres.ru/resources/prediction_analogy/3Dfold
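The pairing of each domain with size- and compactness-matched decoys can be illustrated with a minimal sketch. It is not the authors' procedure. The `Domain` record, the use of radius of gyration as the compactness measure, the 10% tolerances, and the precomputed set of spatially similar domain names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str                  # hypothetical domain identifier
    n_residues: int            # chain length, used here as "size"
    radius_of_gyration: float  # angstroms; assumed proxy for compactness


def within(a: float, b: float, rel_tol: float) -> bool:
    """Return True when a differs from b by at most rel_tol * b."""
    return abs(a - b) <= rel_tol * b


def select_decoys(member, candidates, similar_names,
                  size_tol=0.1, compact_tol=0.1):
    """Pick candidates that are dissimilar to `member` but match it in
    size and compactness. `similar_names` is an assumed, precomputed set
    of domains judged spatially similar to `member`."""
    return [
        c for c in candidates
        if c.name != member.name
        and c.name not in similar_names
        and within(c.n_residues, member.n_residues, size_tol)
        and within(c.radius_of_gyration, member.radius_of_gyration, compact_tol)
    ]
```

Under these assumptions, a candidate is rejected as a decoy if it is structurally similar to the pair member, or if its length or radius of gyration differs from the member's by more than the chosen tolerance.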

Abbreviations

AA: amino acid

3D: three-dimensional

Author information

Corresponding author

Correspondence to A. V. Finkel’shtein.

Additional information

Original Russian Text © M.Yu. Lobanov, N.S. Bogatyreva, D.N. Ivankov, A.V. Finkel’shtein, 2009, published in Molekulyarnaya Biologiya, 2009, Vol. 43, No. 4, pp. 722–732.

About this article

Cite this article

Lobanov, M.Y., Bogatyreva, N.S., Ivankov, D.N. et al. Analogy-based protein structure prediction: I. A new database of spatially similar and dissimilar structures of protein domains for testing and optimizing prediction methods. Mol Biol 43, 665–676 (2009). https://doi.org/10.1134/S0026893309040190

  • DOI: https://doi.org/10.1134/S0026893309040190
