Molecular Biotechnology, Volume 48, Issue 2, pp 183–198

Protein Structure Databases

  • Roman A. Laskowski


Web-based protein structure databases vary widely in type and in the level of information they contain. Of most general interest are the various atlases that describe each experimentally determined protein structure and provide useful links, analyses and schematic diagrams relating to its 3D structure and biological function. Also of great interest are the databases that classify 3D structures by their folds, as these can reveal evolutionary relationships that may be hard to detect from sequence comparison alone. Related to these are the numerous servers that compare folds, which are particularly useful for newly solved structures, especially those of unknown function. Beyond these lie a vast number of databases for the more specialized user, dealing with specific families, diseases, structural features and so on.


Protein structure; Protein Data Bank; PDB; wwPDB; Secondary structure; Fold classification



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
