Abstract
Because real-world graphs exhibit a long-tailed degree distribution, in which many small-degree tail nodes have limited structural connectivity, long-tailed methods have attracted increasing attention and achieved excellent performance. However, real-world graphs are inevitably noisy or incomplete owing to error-prone data acquisition or perturbations, which may violate the assumption of long-tailed methods that the raw graph structure is ideal. To address this issue, we study the impact of graph perturbations on the performance of long-tailed methods and propose a novel GNN-based framework, LTSL-GNN, for graph structure learning and tail-node embedding enhancement. LTSL-GNN iteratively learns the graph structure and the tail-node embedding-enhancement parameters, allowing information-rich head nodes to optimize the graph structure through multi-metric learning and further enhancing the embeddings of tail nodes with the learned structure. Experimental results on six real-world datasets demonstrate that LTSL-GNN outperforms other state-of-the-art baselines, especially when the graph structure is perturbed.
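The multi-metric structure learning mentioned in the abstract can be illustrated with a minimal sketch. This is our illustrative simplification, not the paper's implementation: the function name, the choice of cosine and inner-product metrics, the fixed fusion weights, and the kNN sparsification are all assumptions (in LTSL-GNN the metric weights are learned during training).

```python
import numpy as np

def multi_metric_adjacency(x, weights=(0.5, 0.5), k=3):
    """Fuse two similarity metrics over node features and keep the
    k strongest candidate edges per node (self-loops excluded)."""
    # Metric 1: cosine similarity (rows normalized to unit length).
    norm = x / np.linalg.norm(x, axis=1, keepdims=True)
    cos_sim = norm @ norm.T
    # Metric 2: inner-product similarity, rescaled to [0, 1] for mixing.
    dot_sim = x @ x.T
    dot_sim = (dot_sim - dot_sim.min()) / (dot_sim.max() - dot_sim.min() + 1e-12)
    # Weighted fusion of the two metrics (weights would be learned in practice).
    fused = weights[0] * cos_sim + weights[1] * dot_sim
    np.fill_diagonal(fused, -np.inf)  # never select self-loops
    # kNN sparsification: keep the k largest fused similarities per row.
    adj = np.zeros_like(fused)
    top_k = np.argsort(-fused, axis=1)[:, :k]
    rows = np.arange(x.shape[0])[:, None]
    adj[rows, top_k] = 1.0
    return adj
```

In the framework described above, a refined adjacency of this kind would replace the perturbed raw structure before message passing.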
Abbreviations
- GNN: Graph Neural Network
- KNN: K-Nearest Neighbors
- LeakyReLU: Leaky Rectified Linear Unit
- MLP: Multi-Layer Perceptron
- LTSL-GNN: Long-Tailed Node Classification via Graph Structure Learning
- Prep-GNN: Preprocessing only in LTSL-GNN
- LTSL-w/o-MM: LTSL-GNN without Multi-Metric
- LTSL-w/o-GSL: LTSL-GNN without Graph Structure Learning
- LTSL-main: LTSL-GNN without shift-GNN
References
Wu Z, Pan S, Chen F et al (2021) A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst 32(1):4–24. https://doi.org/10.1109/TNNLS.2020.2978386
Abu-El-Haija S, Perozzi B, Kapoor A et al (2019) MixHop: higher-order graph convolutional architectures via sparsified neighborhood mixing. In: Proceedings of the 36th international conference on machine learning, vol 97. PMLR, pp 21–29. https://proceedings.mlr.press/v97/abu-el-haija19a.html
Wu J, He J, Xu J (2019) Demo-net: degree-specific graph neural networks for node and graph classification. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 406–415. https://doi.org/10.1145/3292500.3330950
Zhang M, Chen Y (2018) Link prediction based on graph neural networks. Adv Neural Inf Process Syst 31. https://proceedings.neurips.cc/paper/2018/file/53f0d7c537d99b3824f0f99d62ea2428-Paper.pdf
Song W, Xiao Z, Wang Y et al (2019) Session-based social recommendation via dynamic graph attention networks. In: Proceedings of the twelfth acm international conference on web search and data mining, association for computing machinery, WSDM’19, New York, pp 555–563. https://doi.org/10.1145/3289600.3290989
Ying R, He R, Chen K et al (2018) Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. association for computing machinery, KDD ’18, New York pp 974–983. https://doi.org/10.1145/3219819.3219890
Fan W, Ma Y, Li Q et al (2019) Graph neural networks for social recommendation. In: The World Wide Web conference. Association for Computing Machinery, WWW ’19, New York, pp 417–426. https://doi.org/10.1145/3308558.3313488
Park D, Song H, Kim M et al (2020) Trap: two-level regularized autoencoder-based embedding for power-law distributed data. In: Proceedings of The Web conference 2020. Association for Computing Machinery, New York, NY, USA, WWW ’20, pp 1615–1624. https://doi.org/10.1145/3366423.3380233
Liu Z, Zhang W, Fang Y et al (2020) Towards locality-aware meta-learning of tail node embeddings on networks. In: Proceedings of the 29th ACM international conference on information & knowledge management. Association for Computing Machinery, New York, NY, USA, CIKM ’20, pp 975–984. https://doi.org/10.1145/3340531.3411910
Liu Z, Nguyen TK, Fang Y (2021) Tail-gnn: tail-node graph neural networks. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. Association for Computing Machinery, New York, NY, USA, KDD ’21, pp 1109–1119. https://doi.org/10.1145/3447548.3467276
Chen Y, Wu L, Zaki M (2020) Iterative deep graph learning for graph neural networks: Better and robust node embeddings. In: Larochelle H, Ranzato M, Hadsell R et al (eds) Advances in neural information processing systems, vol 33. Curran Associates, Inc., pp 19,314–19,326. https://proceedings.neurips.cc/paper/2020/file/e05c7ba4e087beea9410929698dc41a6-Paper.pdf
Dai H, Li H, Tian T et al (2018) Adversarial attack on graph structured data. In: Dy J, Krause A (eds) Proceedings of the 35th international conference on machine learning, proceedings of machine learning research, vol 80. PMLR, pp 1115–1124. https://proceedings.mlr.press/v80/dai18b.html
Jin W, Li Y, Xu H et al (2021) Adversarial attacks and defenses on graphs. SIGKDD Explor Newsl 22(2):19–34. https://doi.org/10.1145/3447556.3447566
Ren K, Zheng T, Qin Z et al (2020) Adversarial attacks and defenses in deep learning. Engineering 6(3):346–360. https://doi.org/10.1016/j.eng.2019.12.012. https://www.sciencedirect.com/science/article/pii/S209580991930503X
Zügner D, Akbarnejad A, Günnemann S (2018) Adversarial attacks on neural networks for graph data. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. Association for Computing Machinery, New York, NY, USA, KDD ’18, pp 2847–2856. https://doi.org/10.1145/3219819.3220078
Lin X, Zhou C, Wu J et al (2023) Exploratory adversarial attacks on graph neural networks for semi-supervised node classification. Pattern Recognit 133:109042. https://doi.org/10.1016/j.patcog.2022.109042. https://www.sciencedirect.com/science/article/pii/S0031320322005222
Jin W, Ma Y, Liu X et al (2020) Graph structure learning for robust graph neural networks. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. Association for Computing Machinery, New York, NY, USA, KDD ’20, pp 66–74. https://doi.org/10.1145/3394486.3403049
Zhao J, Wang X, Shi C et al (2021) Heterogeneous graph structure learning for graph neural networks. Proc AAAI Conf Artif Intell 35(5):4697–4705. https://doi.org/10.1609/aaai.v35i5.16600. https://ojs.aaai.org/index.php/AAAI/article/view/16600
Luo D, Cheng W, Yu W et al (2021) Learning to drop: robust graph neural network via topological denoising. In: Proceedings of the 14th ACM international conference on web search and data mining. Association for Computing Machinery, New York, NY, USA, WSDM ’21, pp 779–787. https://doi.org/10.1145/3437963.3441734
Franceschi L, Niepert M, Pontil M et al (2019) Learning discrete structures for graph neural networks. In: Chaudhuri K, Salakhutdinov R (eds) Proceedings of the 36th international conference on machine learning, proceedings of machine learning research, vol 97. PMLR, pp 1972–1982. https://proceedings.mlr.press/v97/franceschi19a.html
Jiang B, Zhang Z, Lin D, et al. (2019) Semi-supervised learning with graph learning-convolutional networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Chen F, Wang YC, Wang B et al (2020) Graph representation learning: a survey. APSIPA Trans Signal Inf Process 9:e15. https://doi.org/10.1017/ATSIP.2020.13
Xia F, Sun K, Yu S, et al. (2021) Graph learning: a survey. IEEE Trans Artif Intell 2 (2):109–127. https://doi.org/10.1109/TAI.2021.3076021
Bruna J, Zaremba W, Szlam A, et al. (2014) Spectral networks and locally connected networks on graphs. In: International conference on learning representations (ICLR 2014), CBLS 2014
Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in neural information processing systems, pp 3844–3852. https://proceedings.neurips.cc/paper/2016/file/04df4d434d481c5bb723be1b6df1ee65-Paper.pdf
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24-26, 2017, conference track proceedings. OpenReview.net. https://doi.org/10.48550/arXiv.1609.02907. https://openreview.net/forum?id=SJU4ayYgl
Wu F, Souza A, Zhang T et al (2019) Simplifying graph convolutional networks. In: Chaudhuri K, Salakhutdinov R (eds) Proceedings of the 36th international conference on machine learning, proceedings of machine learning research, vol 97. PMLR, pp 6861–6871. https://proceedings.mlr.press/v97/wu19e.html
Monti F, Boscaini D, Masci J et al (2017) Geometric deep learning on graphs and manifolds using mixture model cnns. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 5115–5124
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. In: Guyon I, Luxburg UV, Bengio S et al (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/file/5dd9db5e033da9c6fb5ba83c7a7ebea9-Paper.pdf
Veličković P, Cucurull G, Casanova A et al (2018) Graph attention networks. In: International conference on learning representations. https://openreview.net/forum?id=rJXMpikCZ
Zhuang C, Ma Q (2018) Dual graph convolutional networks for graph-based semi-supervised classification. In: Proceedings of the 2018 World Wide Web conference. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, WWW ’18, pp 499–508. https://doi.org/10.1145/3178876.3186116
Schlichtkrull M, Kipf TN, Bloem P et al (2018) Modeling relational data with graph convolutional networks. In: Gangemi A, Navigli R, Vidal ME (eds) The Semantic Web. Springer International Publishing, Cham, pp 593–607
Feng W, Zhang J, Dong Y et al (2020) Graph random neural networks for semi-supervised learning on graphs. In: Larochelle H, Ranzato M, Hadsell R et al (eds) Advances in neural information processing systems, vol 33. Curran Associates, Inc, pp 22,092–22,103. https://proceedings.neurips.cc/paper/2020/file/fb4c835feb0a65cc39739320d7a51c02-Paper.pdf
Niu X, Li B, Li C et al (2020) A dual heterogeneous graph attention network to improve long-tail performance for shop search in e-commerce. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. Association for Computing Machinery, New York, NY, USA, KDD ’20, pp 3405–3415. https://doi.org/10.1145/3394486.3403393
Yun S, Kim K, Yoon K et al (2022) Lte4g: long-tail experts for graph neural networks. In: Proceedings of the 31st ACM international conference on information & knowledge management. Association for Computing Machinery, New York, NY, USA, CIKM ’22, pp 2434–2443. https://doi.org/10.1145/3511808.3557381
Liu Z, Mao Q, Liu C et al (2022) On size-oriented long-tailed graph classification of graph neural networks. In: Proceedings of the ACM web conference 2022. Association for Computing Machinery, New York, NY, USA, WWW ’22, pp 1506–1516. https://doi.org/10.1145/3485447.3512197
Zhang Y, Pal S, Coates M et al (2019) Bayesian graph convolutional neural networks for semi-supervised classification. Proc AAAI Conf Artif Intell 33(01):5829–5836. https://doi.org/10.1609/aaai.v33i01.33015829. https://ojs.aaai.org/index.php/AAAI/article/view/4531
Wang X, Zhu M, Bo D et al (2020) Am-gcn: adaptive multi-channel graph convolutional networks. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. Association for Computing Machinery, New York, NY, USA, KDD ’20, pp 1243–1253. https://doi.org/10.1145/3394486.3403177
Zhao J, Dong Y, Ding M et al (2021) Adaptive diffusion in graph neural networks. In: Ranzato M, Beygelzimer A, Dauphin Y et al (eds) Advances in neural information processing systems, vol 34. Curran Associates, Inc, pp 23,321–23,333. https://proceedings.neurips.cc/paper/2021/file/c42af2fa7356818e0389593714f59b52-Paper.pdf
Liu Y, Zheng Y, Zhang D et al (2022) Towards unsupervised deep graph structure learning. In: Proceedings of the ACM web conference 2022. Association for Computing Machinery, New York, NY, USA, WWW ’22, pp 1392–1403. https://doi.org/10.1145/3485447.3512186
Zhang R, Nie F, Wang Y et al (2019) Unsupervised feature selection via adaptive multimeasure fusion. IEEE Trans Neural Netw Learn Syst 30(9):2886–2892. https://doi.org/10.1109/TNNLS.2018.2884487
Yun S, Jeong M, Kim R et al (2019) Graph transformer networks. In: Wallach H, Larochelle H, Beygelzimer A et al (eds) Advances in neural information processing systems, vol 32. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2019/file/9d63484abb477c97640154d40595a3bb-Paper.pdf
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, conference track proceedings . https://doi.org/10.48550/arXiv.1412.6980
Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, New York, NY, USA, KDD ’14, pp 701–710. https://doi.org/10.1145/2623330.2623732
Gove R, Cadalzo L, Leiby N et al (2022) New guidance for using t-sne: alternative defaults, hyperparameter selection automation, and comparative evaluation. Vis Informat 6(2):87–97. https://doi.org/10.1016/j.visinf.2022.04.003. https://www.sciencedirect.com/science/article/pii/S2468502X22000201
Acknowledgements
This work is supported by the Fundamental Research Funds for the Central Universities, China under Grant 2021III030JC.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of Interests
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Supplement
In the supplement, we provide the URLs of the baselines and datasets used in all experiments in this paper. For reproducibility, we also report the specific experimental environment and the hyperparameter values of all experiments.
A.1 Baselines
The baselines compared in this paper can be found at the following URLs:
- Deepwalk: https://github.com/phanein/deepwalk/
- DEMO-Net: https://github.com/PetarV-/GAT/
- meta-tail2vec: https://github.com/smufang/meta-tail2vec
A.2 Datasets
The datasets used in this paper can be found at the following URLs:
- Citeseer: It is a citation network of research papers, which are divided into six categories. The feature of each node is a word vector indicating whether the paper contains the corresponding words.
- CoraFull: It is a well-known citation network based on paper topics, which contains 19793 scientific publications divided into 70 categories. Nodes represent papers and edges represent citations.
- Squirrel: It is a page-page network on specific topics in Wikipedia, in which each node represents a web page and each edge denotes a mutual link between pages.
- Actor: It is an actor co-occurrence network, in which each node corresponds to an actor and each edge denotes the co-occurrence of two actors on the same Wikipedia page. Node features are bag-of-words vectors, and the actors are classified into five categories according to the words on their Wikipedia pages.
- BlogCatalog: This dataset is a social network composed of bloggers and their social relationships. Node attributes consist of keywords in user profiles. The labels represent bloggers' interests and are divided into six categories.
- Flickr: It is an image-sharing social network, in which nodes represent users and edges correspond to associations between users. The labels represent users' interests, and all nodes are divided into 9 categories.
A.3 Implementation Details
The code of LTSL-GNN is based on PyTorch Geometric. For reproducibility, the code and model hyperparameters will be made public when the paper is published.
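The long-tailed setting studied above separates information-rich head nodes from small-degree tail nodes by degree. A minimal sketch of such a degree-based split over an edge list follows; the function name and the threshold value are illustrative assumptions, not taken from the paper:

```python
from collections import Counter

def split_head_tail(edges, threshold=5):
    """Partition nodes into head (degree >= threshold) and tail
    (degree < threshold) sets from an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1  # count each endpoint of every undirected edge
        degree[v] += 1
    head = {n for n, d in degree.items() if d >= threshold}
    tail = {n for n, d in degree.items() if d < threshold}
    return head, tail

# Example: node 0 is a hub with degree 5; nodes 1-5 are tail nodes of degree 1.
edges = [(0, i) for i in range(1, 6)]
head, tail = split_head_tail(edges, threshold=5)
```

In practice, such a split would be computed from the dataset's edge index before applying any tail-node enhancement.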
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, J., Wan, Y., Xu, J. et al. Long-tailed graph neural networks via graph structure learning for node classification. Appl Intell 53, 20206–20222 (2023). https://doi.org/10.1007/s10489-023-04534-3