A Hybrid Rule-Induction/Likelihood-Ratio Based Approach for Predicting Protein-Protein Interactions

  • Chapter
Computational Intelligence

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 1)

Abstract

We propose a new hybrid data mining method for predicting protein-protein interactions that combines the likelihood-ratio approach with rule induction algorithms. The method uses a rule induction algorithm to discover rules that partition the data; each discovered rule is then interpreted as a “bin” for which a likelihood ratio is computed. We apply the method to the prediction of protein-protein interactions in the Saccharomyces cerevisiae genome, using predictive genomic features in an integrated scheme. The results show that the hybrid method outperforms a pure likelihood-ratio-based approach.
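
The idea of treating rules as bins can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes an ordered rule list in which the first matching rule defines the bin, a default bin for uncovered protein pairs, the usual per-bin likelihood ratio P(bin | interacting) / P(bin | non-interacting), and additive smoothing (the `pseudocount` parameter) to avoid division by zero. The feature names `coexpr` and `shared_go` and both rules are hypothetical.

```python
from collections import Counter

def assign_bin(example, rules):
    """Return the index of the first rule covering the example.

    Examples covered by no rule fall into a default bin, len(rules).
    """
    for i, rule in enumerate(rules):
        if rule(example):
            return i
    return len(rules)

def bin_likelihood_ratios(examples, labels, rules, pseudocount=1.0):
    """Estimate one likelihood ratio per rule-defined bin.

    LR(bin) = P(bin | positive) / P(bin | negative), with additive
    smoothing (an assumption of this sketch, not taken from the chapter).
    """
    pos, neg = Counter(), Counter()
    for example, is_interacting in zip(examples, labels):
        b = assign_bin(example, rules)
        if is_interacting:
            pos[b] += 1
        else:
            neg[b] += 1
    n_bins = len(rules) + 1  # one bin per rule plus the default bin
    total_pos = sum(pos.values()) + pseudocount * n_bins
    total_neg = sum(neg.values()) + pseudocount * n_bins
    return {
        b: ((pos[b] + pseudocount) / total_pos)
           / ((neg[b] + pseudocount) / total_neg)
        for b in range(n_bins)
    }

# Hypothetical rules over hypothetical genomic features of a protein pair.
rules = [
    lambda e: e["coexpr"] > 0.8,  # strongly co-expressed pair
    lambda e: e["shared_go"],     # pair shares a functional annotation
]

# Toy training data: feature dicts and interaction labels.
examples = [
    {"coexpr": 0.90, "shared_go": False},
    {"coexpr": 0.85, "shared_go": True},
    {"coexpr": 0.10, "shared_go": True},
    {"coexpr": 0.20, "shared_go": False},
    {"coexpr": 0.30, "shared_go": False},
    {"coexpr": 0.90, "shared_go": False},
]
labels = [True, True, True, False, False, False]

ratios = bin_likelihood_ratios(examples, labels, rules)

# Scoring a new pair: look up the likelihood ratio of the bin it falls into.
new_pair = {"coexpr": 0.05, "shared_go": True}
score = ratios[assign_bin(new_pair, rules)]
```

In this sketch a pair would be predicted as interacting when its bin's ratio exceeds a chosen cutoff; how that cutoff is set is left open here.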

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Iqbal, M., Freitas, A.A., Johnson, C.G. (2009). A Hybrid Rule-Induction/Likelihood-Ratio Based Approach for Predicting Protein-Protein Interactions. In: Mumford, C.L., Jain, L.C. (eds) Computational Intelligence. Intelligent Systems Reference Library, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01799-5_19

  • DOI: https://doi.org/10.1007/978-3-642-01799-5_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01798-8

  • Online ISBN: 978-3-642-01799-5

  • eBook Packages: Engineering, Engineering (R0)
