Comparative Analysis of Classification Methods for Protein Interaction Verification System

Lee, Min Su; Park, Seung Soo

doi:10.1007/11890393_24

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4243)

Abstract

We present a comparative study of methods for assessing the reliability of protein-protein interactions in a high-throughput dataset. We apply several state-of-the-art classification algorithms to distinguish true interacting protein pairs from noisy data, using empirical knowledge about interacting proteins, and compare the classifiers' performance under several criteria. Experimental results show that classification algorithms are powerful tools for distinguishing true interacting protein pairs in a noisy protein-protein interaction dataset. Furthermore, in data settings with many missing values, such as protein-protein interaction datasets, the K-Nearest Neighbor and Decision Tree algorithms perform best among the methods compared.
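
To make the missing-value setting concrete, the following is a minimal sketch of a nearest-neighbor classifier that tolerates absent feature values. It is not the authors' implementation: the distance rule (compare only features present in both examples, then rescale), the feature names, and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance over the features present in both vectors.

    Missing values are encoded as None (an assumption of this sketch).
    The squared sum is rescaled by len(a) / (number of shared features)
    so that pairs sharing few features are not favoured. Pairs with no
    shared feature are treated as infinitely far apart.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_predict(train, query, k=3):
    """Return the majority label among the k nearest training examples."""
    nearest = sorted(train, key=lambda example: distance(example[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data. Each vector holds three made-up features:
# (expression correlation, same localization, same functional category).
train = [
    ((0.9, 1.0, None), "true"),
    ((0.8, None, 1.0), "true"),
    ((0.7, 1.0, 1.0), "true"),
    ((0.1, 0.0, None), "noise"),
    ((None, 0.0, 0.0), "noise"),
    ((0.2, None, 0.0), "noise"),
]

if __name__ == "__main__":
    print(knn_predict(train, (0.85, 1.0, None)))
    print(knn_predict(train, (0.15, 0.0, 0.0)))
```

Skipping missing features rather than imputing them is only one of several possible design choices; the paper's preview does not state which strategy each classifier used.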

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Ewha Womans University, Seoul, Korea
Min Su Lee & Seung Soo Park

Editor information

Editors and Affiliations

Dokuz Eylül University, İzmir, Turkey
Tatyana Yakhno
University of Vienna, Vienna, Austria
Erich J. Neuhold

About this paper

Cite this paper

Lee, M.S., Park, S.S. (2006). Comparative Analysis of Classification Methods for Protein Interaction Verification System. In: Yakhno, T., Neuhold, E.J. (eds) Advances in Information Systems. ADVIS 2006. Lecture Notes in Computer Science, vol 4243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11890393_24

DOI: https://doi.org/10.1007/11890393_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46291-0
Online ISBN: 978-3-540-46292-7
eBook Packages: Computer Science, Computer Science (R0)
