Abstract
Given a large graph with few node labels, how can we (a) identify whether generalized network effects (GNE) are present, (b) estimate GNE to explain the interrelations among node classes, and (c) exploit GNE efficiently to improve performance on downstream tasks? Knowledge of GNE is valuable for tasks such as node classification and targeted advertising. However, identifying GNE such as homophily, heterophily, or their combination is challenging in real-world graphs because node labels are scarce and edges are noisy. We propose NetEffect, a graph mining approach that addresses these issues and has the following properties: (i) Principled: a statistical test to determine the presence of GNE in a graph with few node labels; (ii) General and Explainable: a closed-form solution to estimate the specific type of GNE observed; and (iii) Accurate and Scalable: the integration of GNE for accurate and fast node classification. Applied to real-world graphs, NetEffect discovers the unexpected absence of GNE in numerous graphs that were previously recognized as exhibiting heterophily. Further, we show that incorporating GNE is effective for node classification. On a million-scale real-world graph, NetEffect achieves over 7× speedup (14 minutes vs. 2 hours) compared to most competitors.
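To make the idea of testing for network effects concrete, the sketch below checks whether the classes at the two ends of an edge are statistically dependent, using only edges whose endpoints are both labeled. It is an illustrative assumption, not the test or the closed-form estimator used by NetEffect: it swaps in a generic chi-squared statistic with a permutation p-value, and a row-normalized edge-count matrix as a crude stand-in for a class compatibility matrix. All function names are hypothetical.

```python
import random


def edge_label_counts(edges, labels, classes):
    """Symmetric class-by-class counts over edges with both endpoints labeled."""
    idx = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for u, v in edges:
        if u in labels and v in labels:  # unlabeled endpoints are skipped
            i, j = idx[labels[u]], idx[labels[v]]
            counts[i][j] += 1
            counts[j][i] += 1
    return counts


def chi2_statistic(counts):
    """Pearson chi-squared statistic for independence of the two edge ends."""
    total = sum(map(sum, counts))
    if total == 0:
        return 0.0
    rows = [sum(r) for r in counts]
    cols = [sum(c) for c in zip(*counts)]
    stat = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(edges, labels, classes, n_perm=500, seed=0):
    """Share of label shufflings whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = chi2_statistic(edge_label_counts(edges, labels, classes))
    nodes = list(labels)
    values = [labels[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = dict(zip(nodes, values))
        stat = chi2_statistic(edge_label_counts(edges, shuffled, classes))
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def empirical_compatibility(counts):
    """Row-normalize the counts: entry (i, j) is the share of class-i edge ends facing class j."""
    result = []
    for row in counts:
        s = sum(row)
        result.append([c / s if s else 0.0 for c in row])
    return result


# Toy heterophilous graph (made up for illustration): every edge joins an
# even-numbered node (class "a") to an odd-numbered node (class "b").
toy_edges = [(i, i + 1) for i in range(19)] + [(i, i + 3) for i in range(17)]
toy_labels = {n: ("a" if n % 2 == 0 else "b") for n in range(20)}
toy_counts = edge_label_counts(toy_edges, toy_labels, ["a", "b"])
toy_p = permutation_p_value(toy_edges, toy_labels, ["a", "b"])
toy_compat = empirical_compatibility(toy_counts)
```

On the toy graph, the off-diagonal mass of `toy_compat` indicates heterophily, and a small `toy_p` suggests the pattern is unlikely under random labeling. With very few labels, the number of labeled-labeled edges can be too small for any such test to be informative, which is part of what makes the setting in the abstract hard.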
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lee, MC., Shekhar, S., Yoo, J., Faloutsos, C. (2024). NETEFFECT: Discovery and Exploitation of Generalized Network Effects. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol 14645. Springer, Singapore. https://doi.org/10.1007/978-981-97-2242-6_24
DOI: https://doi.org/10.1007/978-981-97-2242-6_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2241-9
Online ISBN: 978-981-97-2242-6
eBook Packages: Computer Science, Computer Science (R0)