Abstract
Mining graph data has become a popular research topic in computer science and has been widely studied in both academia and industry, given the increasing amount of network data in recent years. However, this huge amount of network data also poses great challenges for efficient analysis. This motivates graph representation learning, which maps a graph into a low-dimensional vector space while preserving the original graph structure and supporting graph inference. Efficient graph representation has both profound theoretical significance and important practical value; in this chapter we therefore introduce the basic ideas of graph representation/network embedding, as well as some representative models.
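As a minimal illustration of this idea (not any specific model surveyed in the chapter), the classical spectral approach of Laplacian eigenmaps [21] maps each node of a toy graph to a low-dimensional coordinate so that adjacent nodes end up close together:

```python
import numpy as np

# Toy undirected graph: a 4-node path, given by its adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Graph Laplacian L = D - A, where D is the diagonal degree matrix.
D = np.diag(A.sum(axis=1))
L = D - A

# Eigenvectors of L with the smallest non-zero eigenvalues give a
# structure-preserving low-dimensional embedding (Laplacian eigenmaps).
eigvals, eigvecs = np.linalg.eigh(L)
embedding = eigvecs[:, 1:3]  # one 2-dimensional coordinate row per node

# Nodes linked by an edge lie closer in the embedding space than
# nodes at opposite ends of the path.
print(embedding.shape)  # (4, 2)
```

The deep models discussed in the chapter (e.g., SDNE, DVNE) replace this linear spectral map with learned non-linear encoders, but the goal of preserving graph structure in the embedding space is the same.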
Notes
- 1.
Negative links exist in signed networks, but only non-negative links are considered here.
- 2.
Here the authors use the sigmoid function \(\sigma (x)=\frac{1}{1+\exp (-x)}\) as the non-linear activation function.
- 3.
To simplify the notations, network representations \(Y^{(K)}=\{\mathbf {y}^{(K)}_i\}_{i=1}^n\) are denoted as \(Y=\{\mathbf {y}_i\}_{i=1}^n\) by the authors.
- 4.
When the covariance matrices are not diagonal, Wang et al. propose a fast iterative algorithm (i.e., BADMM) to solve the Wasserstein distance [70].
- 5.
Note that the first term \(p(\{\mathbf {f}(v): v\in V^{*} \} \mid \{\mathbf {h}(v): v\in V^{*} \})\) is maximized by \(\mathbf {f}(v) = \mathbf {g}(\mathbf {h}(v))\), and the maximum value of this probability density is a constant independent of \(\mathbf {h}(v)\). Hence we can focus on maximizing the second term first.
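Note 4 concerns the 2-Wasserstein distance between the Gaussian node embeddings; BADMM is needed only in the non-diagonal case because, for diagonal covariance matrices, the distance has a simple closed form. A sketch of that closed form (illustrative only; parameterization by standard-deviation vectors is our assumption, not notation from the chapter):

```python
import numpy as np

def w2_diagonal_gaussians(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between N(mu1, diag(sigma1^2)) and
    N(mu2, diag(sigma2^2)).  For diagonal (hence commuting) covariance
    matrices the general Gaussian formula reduces to
        W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2,
    where sigma1 and sigma2 are vectors of standard deviations."""
    mu1, sigma1 = np.asarray(mu1, float), np.asarray(sigma1, float)
    mu2, sigma2 = np.asarray(mu2, float), np.asarray(sigma2, float)
    return np.sqrt(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))

# Two 2-dimensional Gaussian embeddings with identical covariances:
# W2 then equals the Euclidean distance between the means.
d = w2_diagonal_gaussians([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
print(d)  # 5.0
```

Because this quantity is a true metric (it satisfies the triangle inequality [30]), it can preserve the transitivity of node similarity in the embedding space.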
References
Agarwal, S., Branson, K., Belongie, S.: Higher order learning with graphs. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 17–24. ACM (2006)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: in Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media (2008)
Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization (2016). arXiv preprint arXiv:1607.06450
Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7, 2399–2434 (2006)
Bengio, Y.: Learning deep architectures for AI. Found. Trends® Mach. Learn. 2(1), 1–127 (2009)
Bonacich, P.: Some unique properties of eigenvector centrality. Soc. Netw. 29(4), 555–564 (2007)
Bonneel, N., Rabin, J., Peyré, G., Pfister, H.: Sliced and Radon Wasserstein barycenters of measures. J. Math. Imaging Vis. 51(1), 22–45 (2015)
Bonneel, N., Van De Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using Lagrangian mass transport. ACM Trans. Graph. (TOG) 30, 158. ACM (2011)
Bryant, V.: Metric Spaces: Iteration and Application. Cambridge University Press (1985)
Cao, S., Lu, W., Xu, Q.: Grarep: learning graph representations with global structural information. In: CIKM ’15, pp. 891–900. ACM, New York (2015)
Chen, C., Tong, H.: Fast eigen-functions tracking on dynamic graphs. In: Proceedings of the 2015 SIAM International Conference on Data Mining, pp. 559–567. SIAM (2015)
Clement, P., Desch, W.: An elementary proof of the triangle inequality for the Wasserstein metric. Proc. Am. Math. Soc. 136(1), 333–339 (2008)
Courty, N., Flamary, R., Ducoffe, M.: Learning Wasserstein embeddings (2017). arXiv preprint arXiv:1710.07457
Courty, N., Flamary, R., Tuia, D., Rakotomamonjy, A.: Optimal transport for domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 39(9), 1853–1865 (2017)
Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. In: International Conference on Machine Learning, pp. 685–693 (2014)
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signal Syst. 2(4), 303–314 (1989)
Dash, N.S.: Context and contextual word meaning. SKASE J. Theor. Linguist. 5(2), 21–31 (2008)
De Goes, F., Breeden, K., Ostromoukhov, V., Desbrun, M.: Blue noise through optimal transport. ACM Trans. Graph. (TOG) 31(6), 171 (2012)
Delalleau, O., Bengio, Y., Roux, N.L.: Efficient non-parametric function induction in semi-supervised learning. In: AISTATS ’05, pp. 96–103 (2005)
Doersch, C.: Tutorial on variational autoencoders (2016). arXiv preprint arXiv:1606.05908
Dreyfus, S.: The numerical solution of variational problems. J. Math. Anal. Appl. 5(1), 30–45 (1962)
Eom, Y.-H., Jo, H.-H.: Tail-scope: using friends to estimate heavy tails of degree distributions in large-scale complex networks. Sci. Rep. 5 (2015)
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., Bengio, S.: Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11, 625–660 (2010)
Givens, C.R., Shortt, R.M., et al.: A class of Wasserstein metrics for probability distributions. Mich. Math. J. 31(2), 231–240 (1984)
Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315–323 (2011)
Grover, A., Leskovec, J.: Node2vec: scalable feature learning for networks. In: KDD ’16, pp. 855–864. ACM, New York (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition (2015). arXiv preprint arXiv:1512.03385
Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A.-R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., et al.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29(6), 82–97 (2012)
Hinton, G.E., Osindero, S., Teh, Y.-W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Holland, P.W., Leinhardt, S.: Holland and Leinhardt reply: some evidence on the transitivity of positive interpersonal sentiment (1972)
Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991)
Jamali, M., Ester, M.: A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the Fourth ACM Conference on Recommender Systems, pp. 135–142. ACM (2010)
Jin, E.M., Girvan, M., Newman, M.E.: Structure of growing social networks. Phys. Rev. E 64(4), 046132 (2001)
Kingma, D., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv preprint arXiv:1412.6980
Kolouri, S., Park, S.R., Thorpe, M., Slepcev, D., Rohde, G.K.: Optimal mass transport: signal processing and machine-learning applications. IEEE Signal Process. Mag. 34(4), 43–59 (2017)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M., Huang, F.: A tutorial on energy-based learning. Predict. Struct. Data 1 (2006)
Leicht, E.A., Holme, P., Newman, M.E.: Vertex similarity in networks. Phys. Rev. E 73(2), 026120 (2006)
Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks. J. Am. Soc. Inf. Sci. Technol. 58(7), 1019–1031 (2007)
Luo, D., Nie, F., Huang, H., Ding, C.H.: Cauchy graph embedding. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 553–560 (2011)
Ma, J., Cui, P., Zhu, W.: Depthlgp: learning embeddings of out-of-sample nodes in dynamic networks. In: AAAI, pp. 370–377 (2018)
Mikolov, T., Karafiát, M., Burget, L., Černockỳ, J., Khudanpur, S.: Recurrent neural network based language model. In: Eleventh Annual Conference of the International Speech Communication Association, pp. 1045–1048 (2010)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Nathan, E., Bader, D.A.: A dynamic algorithm for updating katz centrality in graphs. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, pp. 149–154. ACM (2017)
Ou, M., Cui, P., Pei, J., Zhang, Z., Zhu, W.: Asymmetric transitivity preserving graph embedding. In: Proceedings of ACM SIGKDD, pp. 1105–1114 (2016)
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab (1999)
Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: online learning of social representations. In: SIGKDD, pp. 701–710. ACM (2014)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press (2005)
Rossi, R.A., Ahmed, N.K.: Role discovery in networks. IEEE Trans. Knowl. Data Eng. 27(4), 1112–1131 (2015)
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Neurocomputing: foundations of research. In: Learning Representations by Back-Propagating Errors, pp. 696–699. MIT Press, Cambridge (1988)
Salakhutdinov, R., Hinton, G.: Semantic hashing. Int. J. Approx. Reason. 50(7), 969–978 (2009)
Shaw, B., Jebara, T.: Structure preserving embedding. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 937–944. ACM (2009)
Siegelmann, H.T., Sontag, E.D.: On the computational power of neural nets. J. Comput. Syst. Sci. 50(1), 132–150 (1995)
Smola, A.J., Kondor, R.: Kernels and Regularization on Graphs, pp. 144–158. Springer, Berlin (2003)
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642. Citeseer (2013)
Stoyanov, V., Ropson, A., Eisner, J.: Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure. In: AISTATS’11, Fort Lauderdale, April 2011
Sun, L., Ji, S., Ye, J.: Hypergraph spectral learning for multi-label classification. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 668–676. ACM (2008)
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077. International World Wide Web Conferences Steering Committee (2015)
Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
Tian, F., Gao, B., Cui, Q., Chen, E., Liu, T.-Y.: Learning deep representations for graph clustering. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, pp. 1293–1299 (2014)
Tolstikhin, I., Bousquet, O., Gelly, S., Schoelkopf, B.: Wasserstein auto-encoders (2017). arXiv preprint arXiv:1711.01558
Tu, K., Cui, P., Wang, X., Wang, F., Zhu, W.: Structural deep embedding for hyper-networks. In: Thirty-Second AAAI Conference on Artificial Intelligence, pp. 426–433 (2018)
Tu, K., Cui, P., Wang, X., Yu, P.S., Zhu, W.: Deep recursive network embedding with regular equivalence. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2357–2366. ACM (2018)
Vilnis, L., McCallum, A.: Word representations via Gaussian embedding (2014). arXiv preprint arXiv:1412.6623
Vishwanathan, S.V.N., Schraudolph, N.N., Kondor, R., Borgwardt, K.M.: Graph kernels. J. Mach. Learn. Res. 11, 1201–1242 (2010)
Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234. ACM (2016)
Wang, H., Banerjee, A.: Bregman alternating direction method of multipliers. In: Advances in Neural Information Processing Systems, pp. 2816–2824 (2014)
Werbos, P.J.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990)
Zang, C., Cui, P., Faloutsos, P., Zhu, W.: Long short memory process: modeling growth dynamics of microscopic social connectivity. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 565–574. ACM (2017)
Zhu, D., Cui, P., Wang, D., Zhu, W.: Deep variational network embedding in Wasserstein space. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2827–2836. ACM (2018)
Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation. Technical report (2002)
Zhu, X., Ghahramani, Z., Lafferty, J.: Semi-supervised learning using Gaussian fields and harmonic functions. In: ICML’03, pp. 912–919. AAAI Press (2003)
Zhuang, J., Tsang, I.W., Hoi, S.: Two-layer multiple kernel learning. In: International Conference on Artificial Intelligence and Statistics, pp. 909–917 (2011)
Acknowledgements
We thank Ke Tu (DRNE and DHNE), Daixin Wang (SDNE), Dingyuan Zhu (DVNE) and Jianxin Ma (DepthLGP) for providing us with valuable materials. Xin Wang is the corresponding author. This work is supported by China Postdoctoral Science Foundation No. BX201700136, National Natural Science Foundation of China Major Project No. U1611461 and National Program on Key Basic Research Project No. 2015CB352300.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Zhu, W., Wang, X., Cui, P. (2020). Deep Learning for Learning Graph Representations. In: Pedrycz, W., Chen, SM. (eds) Deep Learning: Concepts and Architectures. Studies in Computational Intelligence, vol 866. Springer, Cham. https://doi.org/10.1007/978-3-030-31756-0_6
DOI: https://doi.org/10.1007/978-3-030-31756-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31755-3
Online ISBN: 978-3-030-31756-0
eBook Packages: Intelligent Technologies and Robotics (R0)