Inferring Protein–Protein Interactions from Multiple Protein Domain Combinations

Part of the book series: Methods in Molecular Biology (MIMB, volume 541)

Abstract

The ever-accumulating wealth of knowledge about protein interactions and the domain architecture of the proteins involved in different organisms offers ways to understand the intricate interplay between interactome and proteome. Ultimately, combining these sources of information will allow the prediction of interactions among proteins for which only the domain composition is known. Based on the currently available protein–protein interaction and domain data of Saccharomyces cerevisiae and Drosophila melanogaster, we introduce a novel method, Maximum Specificity Set Cover (MSSC), to predict potential protein–protein interactions. Using known interactions and the domain architectures of the interacting proteins as training sets, the algorithm employs a set cover approach to select the domain pairs that explain the underlying protein interactions with the highest degree of specificity. While MSSC in its basic version considers only domain pairs as the driving force behind interactions, we also modified the algorithm to account for combinations of more than two domains that govern a protein–protein interaction. This approach allows us to predict previously unknown protein–protein interactions in S. cerevisiae and D. melanogaster with a sensitivity and specificity that clearly exceed those of other approaches. As a proof of concept, we also observe high levels of co-expression and reduced GO distances between proteins predicted to interact. Although our results are very encouraging, we observe that the quality of the predictions depends significantly on the quality of the interactions used as the training set of the algorithm. The algorithm is part of a Web portal available at http://ppi.cse.nd.edu.
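The set cover idea in the abstract can be illustrated with a small greedy sketch. This is not the authors' MSSC implementation: the function names (domain_pairs, greedy_cover, predict), the toy proteins P1 to P5, the domain labels A to C, and the scoring rule (fraction of training protein pairs containing a domain pair that are known to interact) are all assumptions made for illustration only.

```python
from itertools import product


def domain_pairs(doms_a, doms_b):
    """Unordered domain pairs with one domain taken from each protein."""
    return {tuple(sorted(pair)) for pair in product(doms_a, doms_b)}


def greedy_cover(domains, interactions, non_interactions=()):
    """Greedily pick domain pairs until every coverable interaction is explained.

    Hypothetical scoring: prefer the domain pair with the highest share of
    interacting protein pairs among the training pairs that contain it, and
    break ties by the number of still-unexplained interactions it covers.
    """
    positives = {}  # domain pair -> interacting protein pairs that contain it
    for a, b in interactions:
        for dp in domain_pairs(domains[a], domains[b]):
            positives.setdefault(dp, set()).add((a, b))
    negatives = {}  # domain pair -> number of non-interacting pairs containing it
    for a, b in non_interactions:
        for dp in domain_pairs(domains[a], domains[b]):
            negatives[dp] = negatives.get(dp, 0) + 1

    def score(dp, uncovered):
        pos = len(positives[dp])
        specificity = pos / (pos + negatives.get(dp, 0))
        return (specificity, len(positives[dp] & uncovered))

    uncovered = set().union(*positives.values())
    chosen = []
    while uncovered:
        # Sorting makes the choice deterministic when scores tie.
        useful = sorted(dp for dp in positives if positives[dp] & uncovered)
        best = max(useful, key=lambda dp: score(dp, uncovered))
        chosen.append(best)
        uncovered -= positives[best]
    return chosen


def predict(chosen, doms_a, doms_b):
    """Predict an interaction if the two proteins share any chosen domain pair."""
    return bool(domain_pairs(doms_a, doms_b) & set(chosen))


# Toy training data (entirely made up).
domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B"}, "P5": {"C"}}
interactions = [("P1", "P2"), ("P3", "P4")]
non_interactions = [("P5", "P4")]

chosen = greedy_cover(domains, interactions, non_interactions)
print(chosen)
print(predict(chosen, {"A"}, {"B", "C"}))
```

In this toy case the pair (A, B) occurs only in interacting protein pairs, while (B, C) also occurs in a non-interacting pair, so the sketch explains both training interactions with (A, B) alone. Extending the candidates from domain pairs to larger domain combinations, as the abstract describes, would change only how the candidate sets are generated.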

Acknowledgments

Danny Chen was supported in part by the NSF under Grant CCF-0515203. Jesús Izaguirre was supported by partial funding from NSF grants IOB-0313730, CCR-0135195, and DBI-0450067. Stefan Wuchty was supported by the Northwestern Institute of Complexity (NICO).

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Kanaan, S.P., Huang, C., Wuchty, S., Chen, D.Z., Izaguirre, J.A. (2009). Inferring Protein–Protein Interactions from Multiple Protein Domain Combinations. In: Ireton, R., Montgomery, K., Bumgarner, R., Samudrala, R., McDermott, J. (eds) Computational Systems Biology. Methods in Molecular Biology, vol 541. Humana Press. https://doi.org/10.1007/978-1-59745-243-4_3

  • DOI: https://doi.org/10.1007/978-1-59745-243-4_3

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-905-5

  • Online ISBN: 978-1-59745-243-4

  • eBook Packages: Springer Protocols
