Gaussian mixture embedding of multiple node roles in networks

Abstract

Network embedding is a classical topic in network analysis. Most current network embedding methods are deterministic: they map each node to a single low-dimensional vector. As a result, they cannot express the uncertainty in the network or the multiple roles a node may play. In this paper, we propose to embed each node as a mixture of Gaussian distributions in a low-dimensional space, where each Gaussian component corresponds to a latent role of the node. The proposed approach can therefore represent network nodes more comprehensively, especially bridging nodes, which are relevant to several communities. Experiments on real-world network benchmarks demonstrate the effectiveness of our approach, which outperforms state-of-the-art network embedding methods. We also show that the number of components learned for each node is closely related to its topological features, such as node degree, centrality, and clustering coefficient.
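
To make the idea of a mixture-valued node embedding concrete, the following is a minimal sketch and not the paper's implementation. It assumes diagonal covariances and compares two nodes with the expected likelihood kernel (the integral of the product of their densities); the abstract does not state which similarity measure or covariance form is used. The names `log_gauss_overlap`, `mixture_similarity`, `member_a`, `member_b`, and `bridge` are hypothetical, and the toy numbers are chosen by hand rather than learned.

```python
import numpy as np


def log_gauss_overlap(mu_a, var_a, mu_b, var_b):
    """Log of the integral of N(x; mu_a, var_a) * N(x; mu_b, var_b) over x.

    For diagonal Gaussians this integral equals N(mu_a; mu_b, var_a + var_b).
    """
    var = var_a + var_b
    diff = mu_a - mu_b
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var)


def mixture_similarity(node_a, node_b):
    """Expected likelihood kernel between two Gaussian mixtures.

    Each node is a tuple (weights, means, variances) with shapes
    (K,), (K, d), (K, d). This kernel is an assumption of the sketch.
    """
    total = 0.0
    for w_a, mu_a, var_a in zip(*node_a):
        for w_b, mu_b, var_b in zip(*node_b):
            total += w_a * w_b * np.exp(
                log_gauss_overlap(mu_a, var_a, mu_b, var_b)
            )
    return total


# Toy 2-D embedding space with two communities centred at (0, 0) and (5, 5).
# Hand-picked values, not learned embeddings.
member_a = (np.array([1.0]), np.array([[0.0, 0.0]]), np.ones((1, 2)))
member_b = (np.array([1.0]), np.array([[5.0, 5.0]]), np.ones((1, 2)))

# A bridging node plays two roles, so it gets one component per community.
bridge = (
    np.array([0.5, 0.5]),
    np.array([[0.0, 0.0], [5.0, 5.0]]),
    np.ones((2, 2)),
)

print("bridge vs A:", mixture_similarity(bridge, member_a))
print("bridge vs B:", mixture_similarity(bridge, member_b))
print("A vs B:     ", mixture_similarity(member_a, member_b))
```

In this toy setup the two-component bridging node is similar to members of both communities, while the two single-component members are nearly dissimilar to each other. A single deterministic vector for the bridging node could not be close to both communities at once.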

Acknowledgments

This work was partially supported and funded by King Abdullah University of Science and Technology (KAUST), under award number FCC/1/1976-19-01, and NSFC No 61828302, the National Key Research and Development Program of China (2017YFB1002000), Science Technology and Innovation Commission of Shenzhen Municipality (JCYJ20180307123659504), and the State Key Laboratory of Software Development Environment in Beihang University.

Author information

Corresponding author

Correspondence to Xiangliang Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Graph Data Management in Online Social Networks

Guest Editors: Kai Zheng, Guanfeng Liu, Mehmet A. Orgun, and Junping Du

About this article

Cite this article

Chen, Y., Pu, J., Liu, X. et al. Gaussian mixture embedding of multiple node roles in networks. World Wide Web 23, 927–950 (2020). https://doi.org/10.1007/s11280-019-00743-4

Keywords

  • Network embedding
  • Gaussian mixture distribution
  • Energy based learning
  • Graph mining