Measuring in-network node similarity based on neighborhoods: a unified parametric approach

Yang, Yu; Pei, Jian; Al-Barakati, Abdullah

doi:10.1007/s10115-017-1033-5

Regular Paper
Published: 17 February 2017

Knowledge and Information Systems, volume 53, pages 43–70 (2017)

Abstract

In many applications, we need to measure similarity between nodes in a large network based on features of their neighborhoods. Although in-network node similarity based on proximity has been well investigated, measuring in-network node similarity based on neighborhoods surprisingly remains a largely untouched problem in the literature. One challenge is that different applications may need different measurements that capture different meanings of similarity. Furthermore, we often want to make trade-offs between specificity of neighborhood matching and efficiency. In this paper, we investigate the problem in a principled and systematic manner. We develop a unified parametric model and a series of four instance measures. These instance similarity measures not only address a spectrum of meanings of similarity, but also present a series of trade-offs between computational cost and strictness of matching between the neighborhoods of the nodes being compared. Through extensive experiments and case studies, we demonstrate the effectiveness of the proposed model and its instances.

Notes

http://www.iam.unibe.ch/fki/databases/iam-graph-database.
The code is available at http://web.eecs.umich.edu/ dkoutra/CODE/fabp.zip (FaBP) [17]. Because FaBP performs binary classification and outputs, for every node, a belief of being positive, we ran FaBP once for each label in the dataset and assigned each unlabeled node the label with the highest belief value.
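
The per-label procedure described in the second note can be sketched as a one-vs-rest wrapper around any binary scorer. The sketch below is only an illustration under stated assumptions: `one_vs_rest_labels` and `toy_scorer` are hypothetical names, and `toy_scorer` (the fraction of a node's neighbors that are positive seeds) is a simple stand-in, not FaBP. It also assumes the scorer only needs the positive seeds, whereas a real belief-propagation run may treat nodes with other labels as negative evidence.

```python
from typing import Callable, Dict, Iterable, List, Set

# A binary scorer takes the set of positive seed nodes and returns,
# for every node, a belief of being positive.
BinaryScorer = Callable[[Set[str]], Dict[str, float]]


def one_vs_rest_labels(
    known_labels: Dict[str, str],
    unlabeled: Iterable[str],
    binary_scorer: BinaryScorer,
) -> Dict[str, str]:
    """Run the binary scorer once per label and pick the highest belief."""
    labels = sorted(set(known_labels.values()))
    beliefs = {}
    for label in labels:
        positives = {n for n, l in known_labels.items() if l == label}
        beliefs[label] = binary_scorer(positives)
    # Ties are broken in favor of the alphabetically first label.
    return {
        node: max(labels, key=lambda l: beliefs[l].get(node, 0.0))
        for node in unlabeled
    }


# Hypothetical toy graph as an adjacency list (not from the paper).
adjacency: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}


def toy_scorer(positives: Set[str]) -> Dict[str, float]:
    """Stand-in for FaBP: fraction of neighbors that are positive seeds."""
    return {
        node: sum(nb in positives for nb in nbrs) / len(nbrs)
        for node, nbrs in adjacency.items()
    }


known = {"a": "x", "b": "x", "e": "y"}
predicted = one_vs_rest_labels(known, ["c", "d"], toy_scorer)
print(predicted)
```

In this toy graph, node `c` has two of its three neighbors labeled `x`, and node `d` has one of its two neighbors labeled `y`, so each receives the label with the larger score.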

References

[1] Borgatti SP, Everett MG (1993) Two algorithms for computing regular equivalence. Soc Netw 15(4):361–376
[2] Chein M, Mugnier M-L (2008) Graph-based knowledge representation: computational foundations of conceptual graphs. Springer Science & Business Media, Berlin
[3] Deza MM, Deza E (2009) Encyclopedia of distances. Springer, New York
[4] Fei H, Huan J (2008) Structure feature selection for graph classification. In: Proceedings of the 17th ACM conference on information and knowledge management, pp 991–1000. ACM
[5] Gärtner T, Flach P, Wrobel S (2003) On graph kernels: hardness results and efficient alternatives. In: Schölkopf B, Warmuth M (eds) Proceedings of the sixteenth annual conference on computational learning theory and the seventh annual workshop on kernel machines. Lecture notes in computer science, vol 2777. Springer, Heidelberg, pp 129–143
[6] Gilpin S, Eliassi-Rad T, Davidson I (2013) Guided learning for role discovery (GLRD): framework, algorithms, and applications. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining, pp 113–121. ACM
[7] Gregson RAM (1975) Psychometrics of similarity. Academic, New York
[8] Han J, Wen J-R (2013) Mining frequent neighborhood patterns in a large labeled graph. In: Proceedings of the 22nd ACM international conference on information and knowledge management, pp 259–268. ACM
[9] Han J, Wen J-R, Pei J (2014) Within-network classification using radius-constrained neighborhood patterns. In: Proceedings of the 23rd ACM international conference on information and knowledge management. ACM
[10] Henderson K, Gallagher B, Eliassi-Rad T, Tong H, Basu S, Akoglu L, Koutra D, Faloutsos C, Li L (2012) RolX: structural role extraction & mining in large graphs. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1231–1239. ACM
[11] Henderson K, Gallagher B, Li L, Akoglu L, Eliassi-Rad T, Tong H, Faloutsos C (2011) It’s who you know: graph mining using recursive structural features. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 663–671. ACM
[12] Jeh G, Widom J (2002) SimRank: a measure of structural-context similarity. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining, pp 538–543. ACM
[13] Jeh G, Widom J (2003) Scaling personalized web search. In: Proceedings of the 12th international conference on World Wide Web, pp 271–279. ACM
[14] Jin R, Lee VE, Hong H (2011) Axiomatic ranking of network role similarity. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 922–930. ACM
[15] Kashima H, Tsuda K, Inokuchi A (2003) Marginalized kernels between labeled graphs. ICML 3:321–328
[16] Kleinberg J (2000) The small-world phenomenon: an algorithmic perspective. In: Proceedings of the thirty-second annual ACM symposium on theory of computing, pp 163–170. ACM
[17] Koutra D, Ke T-Y, Kang U, Chau DH, Pao H-KK, Faloutsos C (2011) Unifying guilt-by-association approaches: theorems and fast algorithms. In: Proceedings of the European conference on machine learning and principles and practice of knowledge discovery in databases (ECML PKDD), Athens, Greece, pp 245–260
[18] Leskovec J, Chakrabarti D, Kleinberg J, Faloutsos C, Ghahramani Z (2010) Kronecker graphs: an approach to modeling networks. J Mach Learn Res 11:985–1042
[19] Lorrain F, White HC (1971) Structural equivalence of individuals in social networks. J Math Sociol 1(1):49–80
[20] Newman ME (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104
[21] Ng AY, Jordan MI, Weiss Y et al (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 2:849–856
[22] Shervashidze N, Petri T, Mehlhorn K, Borgwardt KM, Vishwanathan S (2009) Efficient graphlet kernels for large graph comparison. In: International conference on artificial intelligence and statistics, pp 488–495
[23] Shervashidze N, Schweitzer P, Van Leeuwen EJ, Mehlhorn K, Borgwardt KM (2011) Weisfeiler-Lehman graph kernels. J Mach Learn Res 12:2539–2561
[24] Sparrow MK (1993) A linear algorithm for computing automorphic equivalence classes: the numerical signatures approach. Soc Netw 15(2):151–170
[25] Sun Y, Han J, Yan X, Yu PS, Wu T (2011) PathSim: meta path-based top-k similarity search in heterogeneous information networks. Proc VLDB Endow 4(11):992–1003
[26] Sun Y, Yu Y, Han J (2009) Ranking-based clustering of heterogeneous information networks with star network schema. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, pp 797–806. ACM
[27] Tong H, Faloutsos C, Pan J-Y (2006) Fast random walk with restart and its applications. In: Proceedings of the sixth international conference on data mining, pp 613–622. IEEE
[28] Yedidia JS, Freeman WT, Weiss Y (2005) Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Trans Inf Theory 51(7):2282–2312
[29] Yu W, Lin X, Zhang W, Chang L, Pei J (2013) More is simpler: effectively and efficiently assessing node-pair similarities based on hyperlinks. Proc VLDB Endow 7(1):13–24

Author information

Authors and Affiliations

Simon Fraser University, Burnaby, BC, Canada
Yu Yang & Jian Pei
King Abdulaziz University, Jeddah, Saudi Arabia
Abdullah Al-Barakati

Corresponding author

Correspondence to Jian Pei.

About this article

Cite this article

Yang, Y., Pei, J. & Al-Barakati, A. Measuring in-network node similarity based on neighborhoods: a unified parametric approach. Knowl Inf Syst 53, 43–70 (2017). https://doi.org/10.1007/s10115-017-1033-5

Received: 15 February 2016
Accepted: 07 February 2017
Published: 17 February 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s10115-017-1033-5
