Co-occurrence statistics-based global and local feature learning for graph networks

  • Data analytics and machine learning
Soft Computing

Abstract

Prediction tasks over the nodes and edges of real-world networks, such as node classification and link prediction, depend on learning useful features. Recent work in representation learning has made significant progress by combining deep neural networks with various network walking methods. However, existing feature learning approaches pay little attention to capturing sufficient global information while also capturing the diversity of connectivity patterns in the network. We propose an algorithmic framework, with two variants called Global2vec and Global2vec-PMI, for learning vector feature representations of nodes in networks. With this network embedding method, we learn a mapping of nodes to a low-dimensional feature space that maximizes the likelihood of preserving both the network neighborhood information and the global statistics of nodes. Both the global and the local statistics of the nodes are used to model the loss function. We demonstrate the efficiency of our methods against existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks. For the multi-label classification task on the BlogCatalog, PPI and Flickr datasets, measured by Micro-F1 and Macro-F1 scores, Global2vec outperforms the compared methods in all cases, and Global2vec-PMI outperforms them in most cases. For the link prediction task, Global2vec-PMI is generally better with the Euclidean and Manhattan distances, while Global2vec performs better with the other distance metrics. Across all distance metrics, the highest area under the curve (AUC) scores are mostly obtained by the proposed global co-occurrence statistics-based methods. In conclusion, our work represents a very efficient way of learning vector representations of different network structures.
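
The abstract refers to global co-occurrence statistics of nodes and to a PMI variant, but this preview does not give the formulation. As a rough illustration only, the sketch below shows one common way such statistics can be gathered: count how often two nodes appear near each other in random walks, then convert the counts to pointwise mutual information (PMI). The toy graph, the function names (`random_walks`, `cooccurrence_counts`, `pmi`), the uniform walk strategy and all parameter values are assumptions made for this example; it is not the paper's Global2vec or Global2vec-PMI implementation.

```python
import math
import random
from collections import Counter


def random_walks(adj, walks_per_node, walk_length, rng):
    """Generate uniform random walks starting from every node (assumed strategy)."""
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cooccurrence_counts(walks, window):
    """Count symmetric (node, context) pairs that fall within a sliding window."""
    counts = Counter()
    for walk in walks:
        for i, u in enumerate(walk):
            for j in range(i + 1, min(i + window + 1, len(walk))):
                v = walk[j]
                counts[(u, v)] += 1
                counts[(v, u)] += 1
    return counts


def pmi(counts):
    """Pointwise mutual information: log(p(u, v) / (p(u) * p(v)))."""
    total = sum(counts.values())
    marginal = Counter()
    for (u, _), c in counts.items():
        marginal[u] += c
    return {
        (u, v): math.log(c * total / (marginal[u] * marginal[v]))
        for (u, v), c in counts.items()
    }


# Hypothetical toy graph: two triangles joined by the bridge edge 2-3.
adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

rng = random.Random(0)  # fixed seed so the walks are reproducible
walks = random_walks(adj, walks_per_node=10, walk_length=8, rng=rng)
counts = cooccurrence_counts(walks, window=2)
scores = pmi(counts)

# Inspect one pair inside a triangle and the pair that forms the bridge.
for pair in [(0, 1), (2, 3)]:
    print(pair, scores.get(pair))
```

In a sketch like this, the raw counts play the role of global statistics, because they are aggregated over all walks, while the window restricts each observation to a local neighborhood. How the paper actually combines the two in its loss function is not stated in the abstract.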

Data availability

Enquiries about data availability should be directed to the authors.

Acknowledgements

Fan Ye was supported by the Natural Science Foundation of Anhui Province of China (under grant 1908085MF187) and Key Natural Science Fund of Department of Education of Anhui Province of China (under grant KJ2018A0011).

Funding

This work was supported by the Natural Science Foundation of Anhui Province of China (under grant 1908085MF187) and Key Natural Science Fund of Department of Education of Anhui Province of China (under grant KJ2018A0011).

Author information

Corresponding author

Correspondence to Fan Ye.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ye, F. Co-occurrence statistics-based global and local feature learning for graph networks. Soft Comput 27, 11319–11328 (2023). https://doi.org/10.1007/s00500-023-08665-0

  • DOI: https://doi.org/10.1007/s00500-023-08665-0
