Journal of Systems Science and Complexity

Volume 23, Issue 5, pp 917–930

On network-based kernel methods for protein-protein interactions with applications in protein functions prediction

  • Limin Li
  • Waiki Ching
  • Yatming Chan
  • Hiroshi Mamitsuka
Article

Abstract

Predicting protein functions is an important issue in the post-genomic era. This paper studies several network-based kernels, including the local linear embedding (LLE) kernel, the diffusion kernel, and the Laplacian kernel, to uncover the relationship between protein functions and protein-protein interactions (PPI). The authors first construct kernels based on PPI networks and then apply support vector machine (SVM) techniques to classify proteins into different functional groups. Five-fold cross-validation is then applied to 359 selected GO terms to compare the performance of the different kernels and of guilt-by-association methods, including neighbor counting and Chi-square methods. Finally, the authors predict the functions of some unknown genes and verify the accuracy of these predictions in part using information from other data sources.
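
As a rough illustration of the kernel-construction step, the sketch below builds a diffusion kernel, commonly defined as the matrix exponential exp(-βL) of the graph Laplacian L = D - A, on a small toy graph. The edge list, the value of β, and the helper names (`graph_laplacian`, `matmul`, `diffusion_kernel`) are assumptions made for illustration and are not taken from the paper; the paper's own kernels, parameters, and PPI data may differ.

```python
# Sketch only: diffusion kernel K = exp(-beta * L) on a toy interaction graph.
# The edges and beta below are hypothetical, not data from the paper.

def graph_laplacian(n, edges):
    """Return L = D - A for an undirected, unweighted graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L

def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(L, beta, squarings=10, terms=12):
    """Approximate exp(-beta * L) with a Taylor series plus scaling and squaring."""
    n = len(L)
    scale = 2 ** squarings
    # Scale the exponent down so the truncated Taylor series is accurate.
    M = [[-beta * L[i][j] / scale for j in range(n)] for i in range(n)]
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in K]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, M)]  # M^k / k!
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    # Undo the scaling: exp(X) = (exp(X / 2^s))^(2^s).
    for _ in range(squarings):
        K = matmul(K, K)
    return K

# Hypothetical four-protein chain: 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]
K = diffusion_kernel(graph_laplacian(4, edges), beta=0.5)
```

In this toy graph the kernel value between directly interacting nodes is larger than between distant ones, which is the property that lets a kernel classifier use network proximity. A matrix like `K` could then be supplied to any SVM implementation that accepts a precomputed kernel; that step is not shown here.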

Key words

Diffusion kernel, kernel method, Laplacian kernel, local linear embedding (LLE) kernel, protein function prediction, support vector machine.

Copyright information

© Institute of Systems Science, Academy of Mathematics and Systems Science, CAS and Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Limin Li (1)
  • Waiki Ching (2)
  • Yatming Chan (2)
  • Hiroshi Mamitsuka (3)
  1. Department of Mathematics, Xi’an Jiaotong University, Xi’an, China
  2. Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
  3. Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
