Protein Complex Similarity Based on Weisfeiler-Lehman Labeling
Proteins in living cells rarely act alone, but instead perform their functions together with other proteins in so-called protein complexes. Being able to quantify the similarity between two protein complexes is essential for numerous applications, for example database searches for complexes that are similar to a given input complex. While the similarity problem has been extensively studied on single proteins and protein families, there is very little work on modeling and computing the similarity between protein complexes. Because protein complexes can be naturally modeled as graphs, general graph similarity measures may in principle be used, but these are often computationally hard to obtain and do not take typical properties of protein complexes into account. Here we propose a parametric family of similarity measures based on Weisfeiler-Lehman labeling. We evaluate it on simulated complexes of the extended human integrin adhesome network. We show that the defined family of similarity measures is in good agreement with edit similarity, a similarity measure derived from graph edit distance, but can be computed more efficiently. The proposed measures can therefore be used in large-scale studies and serve as a basis for further refinements of modeling protein complex similarity.
Keywords: Similarity measure; Protein complexes; Weisfeiler-Lehman labeling; Constrained protein interaction networks; Jaccard similarity
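The abstract does not spell out the proposed family of measures, so the following Python sketch only illustrates the general idea named in the abstract and keywords: refine node labels by Weisfeiler-Lehman relabeling, then compare the resulting label multisets with a Jaccard coefficient. The function names, the graph representation (an adjacency dictionary plus a node-to-protein-name dictionary) and the use of the iteration count as the tunable parameter are assumptions made for illustration, not the authors' definitions.

```python
from collections import Counter


def wl_label_multiset(adjacency, labels, iterations):
    """Collect node labels over `iterations` rounds of Weisfeiler-Lehman relabeling.

    adjacency: dict mapping node id -> iterable of neighbor node ids (undirected).
    labels: dict mapping node id -> initial label (here: a protein name).
    Returns a Counter (multiset) of all labels seen in rounds 0..iterations.
    """
    current = dict(labels)
    collected = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node, neighbors in adjacency.items():
            # New label = own label plus the sorted multiset of neighbor labels.
            neighbor_labels = tuple(sorted(current[n] for n in neighbors))
            refined[node] = (current[node], neighbor_labels)
        current = refined
        collected.update(current.values())
    return collected


def multiset_jaccard(a, b):
    """Jaccard coefficient of two multisets given as Counters."""
    intersection = sum((a & b).values())
    union = sum((a | b).values())
    # Assumption: two empty multisets are treated as identical.
    return intersection / union if union else 1.0


def wl_jaccard_similarity(graph_a, graph_b, iterations=1):
    """Hypothetical similarity: Jaccard coefficient of WL label multisets."""
    labels_a = wl_label_multiset(*graph_a, iterations)
    labels_b = wl_label_multiset(*graph_b, iterations)
    return multiset_jaccard(labels_a, labels_b)


# Two toy complexes: the chains A-B-C and A-B-D (node ids are arbitrary integers).
adjacency = {0: [1], 1: [0, 2], 2: [1]}
complex_1 = (adjacency, {0: "A", 1: "B", 2: "C"})
complex_2 = (adjacency, {0: "A", 1: "B", 2: "D"})

print(wl_jaccard_similarity(complex_1, complex_1))                # 1.0
print(wl_jaccard_similarity(complex_1, complex_2, iterations=0))  # 0.5
```

Node ids are kept separate from protein names so that the same protein can occur more than once in a complex. With zero iterations the sketch reduces to a Jaccard comparison of protein multisets; each further iteration also takes the neighborhood structure into account, which lowers the score for complexes that share proteins but wire them differently.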
This work was supported by Deutsche Forschungsgemeinschaft (DFG) Collaborative Research Center (SFB) 876, projects A6 and C1, and by Mercator Research Center Ruhr (MERCUR), project Pe-2013-0012 (UA Ruhr professorship). The authors thank Eli Zamir for insightful discussions.