Collaboration-Based Function Prediction in Protein-Protein Interaction Networks
The cellular metabolism of a living organism is among the most complex systems that researchers are currently trying to understand. Part of it is described by so-called protein-protein interaction (PPI) networks, and much effort is spent on analyzing these networks. In particular, there has been much interest in predicting certain properties of the nodes in the network (in this case, proteins) from the other information in the network. In this paper, we are concerned with predicting a protein’s functions. Many approaches to this problem exist. Among the approaches that predict a protein’s functions purely from its environment in the network, many are based on the assumption that neighboring proteins tend to have the same functions. In this work we generalize this assumption: we assume that certain neighboring proteins tend to have “collaborative”, but not necessarily the same, functions. We propose a few methods that work under this new assumption. These methods yield better results than those previously considered, with improvements in F-measure ranging from 3% to 17%. This shows that the commonly made assumption of homophily in the network (or “guilt by association”), while useful, is not necessarily the best assumption one can make. The assumption of collaborativeness is a useful generalization of it: it is operational (one can easily define methods that rely on it) and can lead to better results.
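To make the contrast between the two assumptions concrete, the following is a minimal sketch and not the methods proposed in the paper. It assumes a hypothetical toy network (`EDGES`), hypothetical annotations (`FUNCTIONS`), and a simple illustrative scoring rule: under homophily, a protein's candidate functions are scored by how often they occur among its neighbors; under collaboration, function pairs that frequently sit on the two ends of an edge are counted first, and a candidate function is scored by how strongly it pairs with the neighbors' functions.

```python
from collections import Counter
from itertools import product

# Hypothetical toy PPI network: undirected edges between proteins.
EDGES = [("p1", "p2"), ("p3", "p4"), ("p5", "p6"), ("q", "p7"), ("q", "p8")]

# Hypothetical annotations; "q" is the unannotated query protein.
FUNCTIONS = {
    "p1": {"kinase"}, "p2": {"substrate"},
    "p3": {"kinase"}, "p4": {"substrate"},
    "p5": {"kinase"}, "p6": {"substrate"},
    "p7": {"kinase"}, "p8": {"kinase"},
}


def neighbors(protein):
    """Return the set of proteins that share an edge with `protein`."""
    out = set()
    for a, b in EDGES:
        if a == protein:
            out.add(b)
        elif b == protein:
            out.add(a)
    return out


def homophily_scores(protein):
    """Score each function by how many neighbors carry it."""
    counts = Counter()
    for n in neighbors(protein):
        counts.update(FUNCTIONS.get(n, ()))
    return counts


def collaboration_strengths():
    """Count how often functions f and g occur on opposite ends of an edge."""
    strength = Counter()
    for a, b in EDGES:
        for f, g in product(FUNCTIONS.get(a, ()), FUNCTIONS.get(b, ())):
            strength[(f, g)] += 1
            if f != g:  # store the symmetric pair once, without double counting
                strength[(g, f)] += 1
    return strength


def collaboration_scores(protein):
    """Score each function by its pairing strength with neighbors' functions."""
    strength = collaboration_strengths()
    all_functions = set().union(*FUNCTIONS.values())
    scores = Counter()
    for n in neighbors(protein):
        for g in FUNCTIONS.get(n, ()):
            for f in all_functions:
                scores[f] += strength[(f, g)]
    return scores


if __name__ == "__main__":
    print("homophily:", homophily_scores("q").most_common())
    print("collaboration:", collaboration_scores("q").most_common())
```

In this toy network, every annotated interaction links a kinase to a substrate, and both neighbors of `q` are kinases. The homophily rule therefore ranks "kinase" highest for `q`, while the collaboration rule ranks "substrate" highest, which illustrates how the two assumptions can lead to different predictions.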