Molecular Biotechnology, Volume 34, Issue 1, pp 69–93

Cataloging the relationships between proteins

A review of interaction databases
  • Carol Rohl
  • Yancey Price
  • Tiffany B. Fischer
  • Milissa Paczkowski
  • Michael F. Zettel
  • Jerry Tsai


By organizing and making widely accessible the increasing amounts of data from high-throughput analyses, protein interaction databases have become an integral resource for the biological community in relating sequence data to higher-order function. To provide a sense of the use and applicability of these databases, we describe each of the major comprehensive interaction databases as well as some of the more specialized ones. Content description, search/browse functionalities, and data presentation are discussed. A succinct explanation of database contents helps users quickly identify whether the database contains information applicable to their research interests. Broad levels of search/browse functions, together with descriptions and examples, allow users to quickly find and access pertinent data. At this point, clear presentation of both the search results and the primary content is necessary. Many databases display information graphically or divide it into smaller, digestible parts across a number of tabbed or linked pages. In addition, cross-linking between the databases promotes interconnectivity of the data and provides an added layer of relational data for the user. Overall, although these protein interaction databases are under continual improvement, their current state shows that much time and effort have gone into organizing and presenting these large sets of data describing protein interactions.

Index Entries

Biological databases; protein interactions; protein pathways; proteomics





Copyright information

© Humana Press Inc 2006

Authors and Affiliations

  • Carol Rohl (1)
  • Yancey Price (2)
  • Tiffany B. Fischer (2)
  • Milissa Paczkowski (3)
  • Michael F. Zettel (4)
  • Jerry Tsai (2)

  1. Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., Seattle
  2. Department of Biochemistry & Biophysics, Biochemistry/Biophysics Building, 2128 TAMU, Texas A&M University, College Station
  3. Department of Animal Science, Purdue University, West Lafayette
  4. Department of Chemistry, Cornell University, Ithaca
