Integrated Protein Interaction Networks for 11 Microbes

  • Balaji S. Srinivasan
  • Antal F. Novak
  • Jason A. Flannick
  • Serafim Batzoglou
  • Harley H. McAdams
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3909)


We have combined four different types of functional genomic data to create high-coverage protein interaction networks for 11 microbes. Our integration algorithm naturally handles statistically dependent predictors and automatically corrects for differing noise levels and data corruption in different evidence sources. We find that many of the predictions in each integrated network hinge on moderate but consistent evidence from multiple sources rather than strong evidence from a single source, yielding novel biology that would be missed if a single data source such as coexpression or coinheritance were used in isolation. In addition to statistical analysis, we demonstrate via case study that these subtle interactions can reveal new aspects of even well-studied functional modules. Our work represents the largest collection of probabilistic protein interaction networks compiled to date, and our methods can be applied to any sequenced organism and any kind of experimental or computational technique that produces pairwise measures of protein interaction.
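
The point about moderate but consistent evidence can be illustrated with a small sketch. This is not the integration algorithm described in the paper: it is a deliberately simple stand-in that estimates the probability of linkage by counting labeled protein pairs in joint bins of two evidence scores. Estimating over the joint bins, rather than multiplying per-source terms, is one simple way to avoid assuming the sources are independent. The function names, the bin count, and the training data are all hypothetical and made up for illustration.

```python
from collections import Counter

BINS = 3  # assumed discretization: low / moderate / high evidence


def to_cell(evidence, bins=BINS):
    """Map a tuple of scores in [0, 1] to a tuple of bin indices."""
    return tuple(min(int(score * bins), bins - 1) for score in evidence)


def fit_posterior(training_pairs, bins=BINS):
    """Return a function estimating P(linked | evidence) from labeled pairs."""
    counts = {0: Counter(), 1: Counter()}
    for evidence, label in training_pairs:
        counts[label][to_cell(evidence, bins)] += 1

    def posterior(evidence):
        cell = to_cell(evidence, bins)
        linked = counts[1][cell] + 1      # add-one smoothing
        unlinked = counts[0][cell] + 1    # add-one smoothing
        return linked / (linked + unlinked)

    return posterior


# Toy, made-up training set (not data from the paper).
# Each item: ((coexpression score, coinheritance score), label),
# where label 1 means "functionally linked" and 0 means "not linked".
training = (
    [((0.5, 0.5), 1)] * 8 + [((0.5, 0.5), 0)] * 2
    + [((0.5, 0.1), 1)] * 1 + [((0.5, 0.1), 0)] * 9
    + [((0.1, 0.5), 1)] * 1 + [((0.1, 0.5), 0)] * 9
)

posterior = fit_posterior(training)
both_moderate = posterior((0.5, 0.5))  # moderate support from both sources
one_moderate = posterior((0.5, 0.1))   # moderate support from one source only
print(both_moderate, one_moderate)
```

In this toy setting, a pair with moderate scores in both sources receives a much higher estimated linkage probability than a pair with a moderate score in only one, which is the kind of prediction a single-source analysis would miss.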


Keywords: Functional Module; Protein Interaction Network; Protein Pair; Linkage Prediction; Network Integration
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Balaji S. Srinivasan (1, 2)
  • Antal F. Novak (3)
  • Jason A. Flannick (3)
  • Serafim Batzoglou (3)
  • Harley H. McAdams (2)

  1. Department of Electrical Engineering
  2. Department of Developmental Biology
  3. Department of Computer Science, Stanford University, Stanford, USA
