Co-occurrence statistics-based global and local feature learning for graph networks

Ye, Fan

doi:10.1007/s00500-023-08665-0

Data analytics and machine learning
Published: 13 June 2023

Volume 27, pages 11319–11328, (2023)
Soft Computing

Fan Ye ORCID: orcid.org/0000-0002-1652-3223¹

Abstract

Prediction tasks over the nodes and edges of real-world networks, such as node classification and link prediction, depend on learned feature representations. Recent work in representation learning has made significant progress by combining deep neural networks with various network walking methods. However, existing feature learning approaches pay little attention to capturing sufficient global information while also capturing the diversity of connectivity patterns in the network. We propose an algorithmic framework, comprising Global2vec and Global2vec-PMI, for learning vector feature representations of nodes in networks. With this network embedding method, we learn a mapping of nodes to a low-dimensional feature space that maximizes the likelihood of preserving both the network neighborhood information and the global statistics of nodes. The loss function is modeled on both the global and the local statistics of the nodes. We demonstrate the efficiency of our methods against existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks. For multi-label classification on the BlogCatalog, PPI and Flickr datasets, measured by Micro-F1 and Macro-F1 scores, Global2vec outperforms the compared methods in all cases, and Global2vec-PMI outperforms them in most cases. For link prediction, Global2vec-PMI is generally better with Euclidean and Manhattan distances, while Global2vec performs better with the other distance metrics. Across the distance metrics, the highest area under the curve (AUC) scores are mostly obtained by the proposed global co-occurrence statistics-based methods. In conclusion, our work offers an efficient way to learn vector representations of different network structures.
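The abstract does not spell out how the global co-occurrence statistics are computed, so the following is only a minimal sketch of one common way to obtain such statistics for a graph: count how often two nodes appear within a window of each other on uniform random walks, then convert the counts to pointwise mutual information (PMI). The function names, the toy graph, and the parameters (`num_walks`, `walk_length`, `window`) are all illustrative assumptions and are not taken from the article; the actual Global2vec and Global2vec-PMI loss functions may differ.

```python
import math
import random
from collections import Counter


def random_walks(adj, num_walks, walk_length, rng):
    """Generate uniform random walks; `adj` maps each node to a list of neighbors."""
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cooccurrence_counts(walks, window):
    """Count ordered node pairs that appear within `window` steps of each other."""
    counts = Counter()
    for walk in walks:
        for i, u in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(u, walk[j])] += 1
    return counts


def pmi(counts):
    """PMI(u, v) = log( p(u, v) / (p(u) * p(v)) ) for every observed pair.

    The window is symmetric, so row and column marginals coincide and a
    single marginal table is enough.
    """
    total = sum(counts.values())
    marginal = Counter()
    for (u, _), c in counts.items():
        marginal[u] += c
    return {
        (u, v): math.log(c * total / (marginal[u] * marginal[v]))
        for (u, v), c in counts.items()
    }


# Hypothetical toy graph: two triangles joined by the edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
rng = random.Random(0)  # fixed seed so the walks are reproducible
walks = random_walks(adj, num_walks=2, walk_length=10, rng=rng)
counts = cooccurrence_counts(walks, window=2)
scores = pmi(counts)
```

In a sketch like this, the raw counts play the role of a global statistic (as in GloVe-style objectives), while the PMI scores discount pairs that co-occur often only because both nodes are visited frequently. An embedding method would then fit node vectors to either table; that fitting step is omitted here because the abstract does not describe it.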

Data availability

Enquiries about data availability should be directed to the authors.

Acknowledgements

Fan Ye was supported by the Natural Science Foundation of Anhui Province of China (under grant 1908085MF187) and Key Natural Science Fund of Department of Education of Anhui Province of China (under grant KJ2018A0011).

Funding

This work was supported by the Natural Science Foundation of Anhui Province of China (under grant 1908085MF187) and Key Natural Science Fund of Department of Education of Anhui Province of China (under grant KJ2018A0011).

Author information

Authors and Affiliations

School of Computer Science and Technology, Anhui University, Hefei, 230601, China
Fan Ye

Authors

Fan Ye

Corresponding author

Correspondence to Fan Ye.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ye, F. Co-occurrence statistics-based global and local feature learning for graph networks. Soft Comput 27, 11319–11328 (2023). https://doi.org/10.1007/s00500-023-08665-0

Accepted: 25 May 2023
Published: 13 June 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s00500-023-08665-0
