
Learning structure perception MLPs on graphs: a layer-wise graph knowledge distillation framework

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Graph neural networks (GNNs) are expressive models for graph data. Because of their large storage requirements and high computational complexity, however, it is difficult to deploy these cumbersome models in resource-constrained environments. As a representative model compression strategy, knowledge distillation (KD) has been introduced into graph analysis research to address this problem. Yet existing graph knowledge distillation algorithms still face crucial challenges, such as the effectiveness of knowledge transfer and the design of the student model. To address these problems, a new graph distillation model is proposed in this paper. Specifically, a layer-wise mapping strategy is designed to distill knowledge for training the student model, in which the staged knowledge learned by the intermediate layers of the teacher GNN is captured to form supervision signals, and an adaptive weight mechanism is developed to evaluate the importance of the distilled knowledge. On this basis, a structure perception MLP is constructed as the student model, which captures prior information about the input graph from the perspectives of node features and topology structure. In this way, the proposed model shares the prediction advantage of GNNs and the latency advantage of MLPs. Node classification experiments on five benchmark datasets demonstrate the validity and superiority of our model over baseline algorithms.
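To make the mechanism described in the abstract concrete, the sketch below gives a minimal, illustrative PyTorch rendering of layer-wise distillation with adaptive weights and a structure-aware MLP student. It is not the authors' implementation: the names (`StructurePerceptionMLP`, `layerwise_kd_loss`), the choice of a one-hop normalized-adjacency aggregation as the structural prior, and the MSE and softened-logit details are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StructurePerceptionMLP(nn.Module):
    """Student MLP whose input concatenates raw node features with a one-hop
    neighbourhood aggregation, so topology information is available without
    message passing at inference time (illustrative design, not the paper's)."""

    def __init__(self, in_dim, hidden_dim, out_dim, num_layers=3):
        super().__init__()
        dims = [2 * in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
        self.layers = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers)]
        )

    def forward(self, x, adj_norm):
        # Structural prior: one normalized-adjacency aggregation, precomputable offline.
        h = torch.cat([x, adj_norm @ x], dim=-1)
        hidden_states = []
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if i < len(self.layers) - 1:
                h = F.relu(h)
            hidden_states.append(h)
        return h, hidden_states


def layerwise_kd_loss(student_states, teacher_states, alphas):
    """Layer-wise distillation: each student layer matches a mapped teacher layer,
    weighted by softmax-normalized adaptive weights `alphas`. Assumes matching
    hidden sizes; otherwise a per-layer linear projection would be needed."""
    weights = torch.softmax(alphas, dim=0)
    loss = torch.zeros((), device=student_states[0].device)
    for w, hs, ht in zip(weights, student_states, teacher_states):
        loss = loss + w * F.mse_loss(hs, ht.detach())
    return loss


# Illustrative training objective: hard labels + softened teacher logits +
# layer-wise feature distillation (all weighting choices are hypothetical).
def total_loss(student, x, adj_norm, labels, train_mask,
               teacher_logits, teacher_states, alphas, T=2.0):
    logits, states = student(x, adj_norm)
    ce = F.cross_entropy(logits[train_mask], labels[train_mask])
    kd = F.kl_div(F.log_softmax(logits / T, dim=-1),
                  F.softmax(teacher_logits.detach() / T, dim=-1),
                  reduction="batchmean") * (T * T)
    return ce + kd + layerwise_kd_loss(states, teacher_states, alphas)
```

Because the adjacency-based aggregation can be precomputed, inference in such a setup needs only the MLP forward pass, which is the latency advantage of MLP students that the abstract refers to.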


Data availability

The common benchmark datasets used in this paper can be accessed through the following link: https://github.com/BUPT-GAMMA/CPF/tree/master/data. We have strictly followed the rules for the use of these datasets and have ensured that the data are legally available and used.


Acknowledgements

The authors are very grateful to the reviewers and editors for their suggestions. This work is supported by the National Natural Science Foundation of China (U21A20513, 62076154, 62276159, 62276161, T2122020), the Key R&D Program of Shanxi Province (202202020101003, 202302010101007), and the Fundamental Research Program of Shanxi Province (202303021221055).

Author information

Authors and Affiliations

Authors

Contributions

Hangyuan Du: proposing the method, designing experiments, revising the original draft, editing. Rong Yu: designing the model, designing experiments, writing the original draft, revising the original draft. Liang Bai: improving the model, revising the original draft. Lu Bai: experiment analysis, improving the language. Wenjian Wang: preparing tables and figures, editing.

Corresponding author

Correspondence to Hangyuan Du.

Ethics declarations

Competing interests

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Du, H., Yu, R., Bai, L. et al. Learning structure perception MLPs on graphs: a layer-wise graph knowledge distillation framework. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02150-2


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s13042-024-02150-2

Keywords
