Prediction of Protein Complexes Based on Protein Interaction Data and Functional Annotation Data Using Kernel Methods

  • Shi-Hua Zhang
  • Xue-Mei Ning
  • Hong-Wei Liu
  • Xiang-Sun Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4115)


Prediction of protein complexes is a crucial problem in computational biology. The increasing amount of available genomic data can enhance the identification of protein complexes. Here we describe an approach for predicting protein complexes based on the integration of protein-protein interaction (PPI) data and protein functional annotation data. The basic idea is that proteins in a complex often interact with each other, and that complexes exhibit high functional consistency, sometimes across multiple functions. We create a protein-protein relationship network (PPRN) via a kernel-based integration of these two genomic data sources. We then apply the MCODE algorithm to the PPRN to detect network clusters as numerically determined protein complexes. We present the results of applying the approach to the yeast Saccharomyces cerevisiae. Comparison with well-known experimentally derived complexes and with the results of other methods verifies the effectiveness of our approach.
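To make the kernel-based integration step concrete, the following is a minimal sketch of how a PPRN could be assembled from a PPI graph and a protein-by-function annotation matrix. The abstract does not specify the kernels, the weights, or the cut-off, so everything here is an assumption made for illustration: a diffusion kernel on the PPI graph, a linear kernel on the annotation vectors, an equal-weight sum of the two normalized kernels, and a fixed threshold. The function names (`diffusion_kernel`, `build_pprn`) and the parameters `beta` and `threshold` are hypothetical, and the subsequent clustering of the PPRN with MCODE is not shown.

```python
import numpy as np


def diffusion_kernel(adj, beta=0.5):
    """Diffusion kernel exp(-beta * L) of an undirected graph, with L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    vals, vecs = np.linalg.eigh(lap)  # L is symmetric, so eigh applies
    return (vecs * np.exp(-beta * vals)) @ vecs.T


def linear_kernel(annot):
    """Inner products of binary function-annotation vectors (one row per protein)."""
    return annot @ annot.T


def normalize(k):
    """Scale a kernel so that every non-zero diagonal entry becomes 1."""
    d = np.sqrt(np.diag(k))
    d[d == 0] = 1.0  # proteins with an all-zero row are left unscaled
    return k / np.outer(d, d)


def build_pprn(adj, annot, beta=0.5, threshold=0.5):
    """Assumed integration: equal-weight sum of normalized kernels, then a cut-off."""
    k = 0.5 * normalize(diffusion_kernel(adj, beta)) \
        + 0.5 * normalize(linear_kernel(annot))
    net = (k >= threshold).astype(int)
    np.fill_diagonal(net, 0)  # no self-relationships
    return net


# Toy data (invented): proteins 0-2 form a triangle, proteins 3-4 share one edge.
adj = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

# Columns are two hypothetical functional categories.
annot = np.array([
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
], dtype=float)

net = build_pprn(adj, annot)
```

In this toy case, proteins that both interact and share an annotation end up linked in `net`, while proteins from the two unrelated groups do not. A real application would need the weighting and threshold chosen against known complexes rather than fixed by hand.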


Protein Complex; Functional Annotation; Kernel Method; Protein Interaction Network; Protein Interaction Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Shi-Hua Zhang (1)
  • Xue-Mei Ning (1)
  • Hong-Wei Liu (2)
  • Xiang-Sun Zhang (1)
  1. Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  2. School of Economics, Renmin University of China, Beijing, China
