Protein-Protein Interaction Databases

  • Damian Szklarczyk
  • Lars Juhl Jensen
Part of the Methods in Molecular Biology book series (MIMB, volume 1278)


Years of meticulous curation of the scientific literature and increasingly reliable computational predictions have resulted in the creation of vast databases of protein interaction data. Over the years, these repositories have become a basic framework in which experiments are analyzed and new directions of research are explored. Here we present an overview of the most widely used protein-protein interaction databases and the methods they employ to gather, combine, and predict interactions. We also point out the trade-off between comprehensiveness and accuracy and the main pitfall scientists have to be aware of before adopting protein interaction databases in any single-gene or genome-wide analysis.

Key words

Protein-protein interactions; Functional associations; Protein-protein interaction databases; Pathways; Protein-protein interaction prediction; Biochemical pathways; Selection bias



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Faculty of Health Sciences, Novo Nordisk Foundation Centre for Protein Research, University of Copenhagen, København N, Denmark
