Manual Annotation of Protein Interactions

  • Protocol
Protein Networks and Pathway Analysis

Part of the book series: Methods in Molecular Biology (MIMB, volume 563)

Abstract

Protein interactions are the basic building blocks for the assembly of pathways and networks. Almost any biologically meaningful functionality (for instance, linear signaling pathways, chains of metabolic reactions, transcription factor dimers, protein complexes of the transcriptosome, gene–disease associations) can be represented as a combination of binary relationships between “network objects” (genes, proteins, RNA species, bioactive compounds). Naturally, the assembled pathways and networks are only as good as their “weakest” link (i.e., a wrongly assigned interaction), and errors multiply in multi-step pathways. Therefore, the utility of “systems biology” depends fundamentally on the quality and relevance of protein interactions. The second important parameter is the sheer number of interactions assembled in the database. One needs a “critical mass” of species-specific interactions in order to build cohesive networks for a gene list, rather than a constellation of unconnected proteins and protein pairs. The third issue is semantic consistency between interactions of different types. Transient physical signal transduction interactions, reactions of endogenous metabolism, transcription factor–promoter binding, and kinetic drug–target interactions are all very different in nature. Yet they have to fit into one database format and be mutually consistent in order to be useful in the reconstruction of cellular processes.
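
As a rough illustration of the first two points, the sketch below stores binary relationships between network objects as plain records, estimates how per-interaction errors compound along a multi-step pathway, and groups objects into connected networks. It is not the MetaBase™ schema or the authors' method: the Interaction record, its confidence field, the placeholder object names, and the assumption that annotation errors are independent are all hypothetical choices made for this example. Under that independence assumption, a three-step pathway whose steps are each 90% reliable is correct as a whole with probability of only about 0.73.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Interaction:
    """One binary relationship between two network objects (hypothetical record)."""
    source: str        # upstream object: gene, protein, RNA species, or compound
    target: str        # downstream object
    kind: str          # interaction type, e.g. "binding" or "phosphorylation"
    confidence: float  # assumed probability that the annotation is correct


def pathway_confidence(steps):
    """Probability that every step is correct, assuming independent errors."""
    return prod(step.confidence for step in steps)


def connected_groups(interactions):
    """Split network objects into connected groups, ignoring edge direction."""
    neighbours = {}
    for item in interactions:
        neighbours.setdefault(item.source, set()).add(item.target)
        neighbours.setdefault(item.target, set()).add(item.source)

    seen, groups = set(), []
    for start in neighbours:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first walk over one connected group
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Placeholder objects and confidence values, not curated data.
pathway = [
    Interaction("ProteinA", "ProteinB", "binding", 0.9),
    Interaction("ProteinB", "ProteinC", "phosphorylation", 0.9),
    Interaction("ProteinC", "GeneD", "transcription regulation", 0.9),
]
isolated_pair = [Interaction("ProteinX", "ProteinY", "binding", 0.95)]

print(pathway_confidence(pathway))
print(connected_groups(pathway + isolated_pair))
```

With too few interactions in the collection, connected_groups returns many small groups (the “constellation” of unconnected proteins and pairs) instead of one cohesive network for the gene list.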

High-quality protein interactions are available in the peer-reviewed “small experiment” literature and, to a much smaller extent, in patents. However, it is very challenging to find the interactions, annotate them with searchable (and computable) parameters, catalogue them in computer-readable form, and assemble them into a database. There are hundreds of thousands of mammalian interactions scattered across tens of thousands of papers in a few thousand scientific journals. There are no widely used standards for reporting interactions in scientific texts and, therefore, text-mining tools have only limited applicability. In order to generate a meaningful database of protein interactions, one needs a well-developed technology of manual curation, supported by computational solutions, managerial procedures, quality control, and user feedback. Here we describe our ever-evolving annotation approach, the important annotation issues and our solutions, and the mammalian protein interactions database MetaBase™, which we have been working on for over 8 years.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Bureeva, S., Zvereva, S., Romanov, V., Serebryiskaya, T. (2009). Manual Annotation of Protein Interactions. In: Nikolsky, Y., Bryant, J. (eds) Protein Networks and Pathway Analysis. Methods in Molecular Biology, vol 563. Humana Press. https://doi.org/10.1007/978-1-60761-175-2_5

  • DOI: https://doi.org/10.1007/978-1-60761-175-2_5

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-174-5

  • Online ISBN: 978-1-60761-175-2

  • eBook Packages: Springer Protocols
